Cytochrome P450 and cytochrome P450 reductase polypeptides, encoding nucleic acid molecules and uses thereof

ABSTRACT

Provided are cytochrome P450 polypeptides, including cytochrome P450 santalene oxidase polypeptides, cytochrome P450 bergamotene oxidase polypeptides and cytochrome P450 reductase polypeptides. Also provided are nucleic acid molecules encoding the cytochrome P450 polypeptides. Cells containing the nucleic acids and/or the polypeptides are provided as are methods for producing terpenes, such as santalols and bergamotols, by culturing the cells.

RELATED APPLICATIONS

Benefit of priority is claimed to U.S. Provisional Application Ser. No. 61/796,129, filed Nov. 1, 2012, entitled “CYTOCHROME P450 AND CYTOCHROME P450 REDUCTASE POLYPEPTIDES, ENCODING NUCLEIC ACID MOLECULES AND USES THEREOF” and to U.S. Provisional Application Ser. No. 61/956,086, filed May 31, 2013, entitled “CYTOCHROME P450 AND CYTOCHROME P450 REDUCTASE POLYPEPTIDES, ENCODING NUCLEIC ACID MOLECULES AND USES THEREOF.” The subject matter of each of the above-noted applications is incorporated by reference in its entirety.

INCORPORATION BY REFERENCE OF SEQUENCE LISTING PROVIDED ELECTRONICALLY

An electronic version of the Sequence Listing is filed herewith, the contents of which are incorporated by reference in their entirety. The electronic file is 301 kilobytes in size, and titled 229SEQPC1.txt.

FIELD OF THE INVENTION

Provided are cytochrome P450 santalene oxidases, cytochrome P450 bergamotene oxidases and cytochrome P450 reductases, nucleic acid molecules encoding the P450 santalene oxidases, cytochrome P450 bergamotene oxidases and cytochrome P450 reductases, and methods for producing products whose synthesis includes reactions catalyzed by the cytochrome P450 santalene and bergamotene oxidases. Included among the products are santalols and bergamotols and precursors and derivatives thereof.

BACKGROUND

Sandalwood (Santalum album) is a slow-growing hemi-parasitic tropical tree of great economic value found growing in southern India, Sri Lanka, eastern Indonesia and northern Australia. The timber is highly sought after for its fine grain, high density and excellent carving properties. Sandalwood heartwood has a unique fragrance imparted by the resins and essential oils, including santalols, santalenes and other sesquiterpenoids, in the heartwood. In general, Santalum album heartwood contains up to 6% dry weight sesquiterpene oils. Sandalwood oil predominantly contains the sesquiterpene alcohols α-santalol, β-santalol, Z-α-trans-bergamotol and epi-β-santalol, and additionally includes α-santalene, β-santalene, α-bergamotene, epi-β-santalene, β-bisabolene, α-curcumene, β-curcumene and γ-curcumene. Sandalwood oil has a soft, sweet-woody and animal-balsamic odor that is imparted from the terpenoid β-santalol and is highly valued. Sandalwood oil has been obtained by distillation of the heartwood of Santalum species and is used as a perfume ingredient, in incenses and traditional medicine and in pesticides.

Centuries of over-exploitation have led to the demise of sandalwood in natural stands. Large plantations are being established throughout northern Australia to satisfy demand and conserve remaining reserves. In addition, there is great variation in the amount of heartwood oil produced, even under near-identical growing conditions, due to genetic and environmental factors, such as climate and local conditions. Generally, the price and availability of plant natural extracts depend upon the abundance, oil yield and geographical origins of the plants.

Although chemical approaches to generate santalols and the other sesquiterpenoids in sandalwood oil have been attempted, the highly complex structures of these compounds have rendered economically viable synthetic processes for their preparation in large quantities unattainable. Thus, there is a need for efficient, cost-effective syntheses of santalols and other sesquiterpenoids that impart the highly sought after sandalwood fragrance for use in the fragrance industry.

Thus, among the objects herein, is the provision of methods for the production of santalols and other sesquiterpenoids and the resulting products of the methods.

SUMMARY

Provided herein are nucleic acid molecules encoding cytochrome P450 polypeptides or catalytically active fragments thereof and the encoded polypeptides, and host cells containing such nucleic acid molecules or encoded polypeptides. For example, the encoded cytochrome P450 polypeptide or catalytically active fragment or portion thereof exhibits at least 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to SEQ ID NO:50, such as at least 90% sequence identity to SEQ ID NO:50. Also provided are nucleic acid molecules encoding cytochrome P450 reductase polypeptides or catalytically active fragments thereof and the encoded polypeptides, and host cells containing such nucleic acid molecules or encoded polypeptides. For example, the encoded cytochrome P450 reductase polypeptide or catalytically active fragment thereof exhibits at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a cytochrome P450 reductase polypeptide set forth in SEQ ID NO:12 or 13, such as at least 90% sequence identity to a cytochrome P450 reductase polypeptide set forth in SEQ ID NO:12 or 13. Any of the nucleic acid molecules provided herein can be cDNA or can be an isolated or purified nucleic acid molecule. Among the nucleic acid molecules and polypeptides provided herein are a set of CYP450s that exhibit santalene/bergamotene oxidase activity, which provide for, among other things, metabolic engineering of sandalwood oil biosynthesis, improvement of sandalwood plantations, and conservation of native sandalwood forests.
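By way of illustration only, a percent sequence identity threshold of the kind recited above can be checked computationally once two sequences have been aligned. The following Python sketch assumes the two sequences are already aligned to equal length with "-" marking gaps, and assumes that identity is counted as identical positions divided by the number of aligned columns; other conventions for the denominator exist, and any definition of sequence identity given elsewhere in this description governs. The function names and the short toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: identity = identical residues / aligned columns * 100,
    where columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information
        columns += 1
        if a == b:
            matches += 1  # neither residue is a gap here, so this is a true match
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair exhibits at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical toy sequences (not from the Sequence Listing).
reference = "MKTAYIAK"
candidate = "MKTAYLAK"
identity = percent_identity(reference, candidate)
```

In this toy case one of eight aligned positions differs, so the candidate would satisfy an "at least 85%" threshold but not an "at least 90%" threshold.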

In particular, among the host cells provided herein are host cells that are engineered to contain heterologous nucleic acid encoding any of the cytochrome P450 polypeptides provided herein, whereby the host cells are capable of producing one or more of α-santalol from α-santalene, β-santalol from β-santalene, epi-β-santalol from epi-β-santalene and α-trans-bergamotol from α-trans-bergamotene, such as one or more of (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol, (Z)-α-trans-bergamotol or (E)-α-trans-bergamotol. For example, the host cells are also engineered to contain a santalene synthase as described herein to produce a santalene and/or bergamotene terpene substrate of the encoded cytochrome P450 polypeptide. The host cells also can be engineered to contain heterologous nucleic acid encoding a cytochrome P450 reductase, such as any provided herein. The host cell is a prokaryotic cell or a eukaryotic cell, such as a bacterial, yeast, insect, plant or mammalian cell. For example, the host cell is a Saccharomyces genus cell, a Pichia genus cell or an Escherichia coli cell. In particular examples herein, the host cell is a Saccharomyces cerevisiae cell. The host cell produces or is modified to produce or overexpress an acyclic pyrophosphate terpene precursor, such as farnesyl diphosphate.

For example, provided herein are isolated Santalum album cytochrome P450 polypeptides or catalytically active fragments thereof, including cytochrome P450 santalene oxidases or catalytically active fragments thereof and cytochrome P450 bergamotene oxidases or catalytically active fragments thereof. Also provided herein are nucleic acid molecules encoding the cytochrome P450 santalene oxidases and cytochrome P450 bergamotene oxidases or catalytically active fragments thereof. Also provided are modified forms thereof.

Also provided are nucleic acid molecules encoding cytochrome P450 reductase polypeptides, including modified cytochrome P450 reductase polypeptides. Provided herein are isolated Santalum album cytochrome P450 reductase polypeptides, and host cells containing the polypeptides, where the polypeptides are heterologous to the host cell. Provided are nucleic acid molecules encoding a fusion protein containing a cytochrome P450 enzyme and a second moiety such as a synthase or catalytically active portion thereof.

Also provided are nucleic acid molecules encoding fusion proteins containing a Santalum album santalene synthase and/or a cytochrome P450 santalene oxidase or bergamotene oxidase and/or a cytochrome P450 reductase, or catalytically active fragments of any of the enzymes. Exemplary of the nucleic acid molecules encoding fusion proteins are nucleic acid molecules encoding a fusion protein containing: a santalene synthase and a cytochrome P450 santalene oxidase; a santalene synthase and a bergamotene oxidase; a cytochrome P450 santalene oxidase and a cytochrome P450 reductase; and a cytochrome P450 bergamotene oxidase and a cytochrome P450 reductase or catalytically active fragments of any of the preceding enzymes. The encoded proteins and host cells containing the nucleic acids and/or the proteins are provided.

Also provided herein are methods for producing any of the encoded cytochrome P450 polypeptides or catalytically active fragments thereof, including methods for producing a cytochrome P450 reductase polypeptide. Also provided herein are methods for production of a santalol, bergamotol and/or mixtures thereof by contacting the cytochrome P450 santalene oxidases and/or cytochrome P450 bergamotene oxidases with a substrate therefor from which these products are produced. The methods can be performed in vitro with isolated reagents or partially isolated reagents or in vivo in a host cell that encodes the enzymes, and optionally a synthase and/or other substrate.

For example, provided herein are isolated Santalum album cytochrome P450 santalene oxidases or catalytically active fragments thereof. The provided isolated Santalum album cytochrome P450 santalene oxidases catalyze the hydroxylation or monooxygenation of santalene and/or bergamotene. In one example, the provided isolated Santalum album cytochrome P450 santalene oxidases catalyze the formation of a santalol from a santalene and/or a bergamotol from a bergamotene. For example, the isolated Santalum album cytochrome P450 santalene oxidases catalyze the formation of α-santalol from α-santalene, β-santalol from β-santalene, epi-β-santalol from epi-β-santalene and/or Z-α-trans-bergamotol from α-trans-bergamotene. For example, the isolated Santalum album cytochrome P450 santalene oxidases catalyze the formation of (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol, (Z)-α-trans-bergamotol or (E)-α-trans-bergamotol. Also provided herein are isolated cytochrome P450 santalene oxidases that are members of the CYP76 family.

Provided herein are isolated nucleic acid molecules encoding a Santalum album cytochrome P450 santalene oxidase polypeptide or a catalytically active fragment thereof. For example, provided herein are isolated nucleic acid molecules (and host cells containing the nucleic acid molecules, which are heterologous to the host cells) encoding a cytochrome P450 santalene oxidase polypeptide having a sequence of amino acids set forth in SEQ ID NO:7, 74, 75, 76 or 77; or a cytochrome P450 santalene oxidase polypeptide having a sequence of amino acids that has at least 96% sequence identity to a cytochrome P450 santalene oxidase whose sequence is set forth in SEQ ID NO:7, 74, 75, 76 or 77. In another example provided herein are isolated nucleic acid molecules encoding a cytochrome P450 santalene oxidase polypeptide having a sequence of amino acids that has at least 50% sequence identity to a cytochrome P450 santalene oxidase polypeptide set forth in SEQ ID NO:7, 74, 75, 76 or 77. The cytochrome P450 santalene oxidase polypeptide catalyzes the hydroxylation or monooxygenation of santalene and/or bergamotene. For example, the encoded cytochrome P450 santalene oxidase polypeptide exhibits at least 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to a sequence of amino acids set forth in SEQ ID NO:7, 74, 75, 76 or 77.

Also provided herein are isolated nucleic acid molecules encoding a cytochrome P450 santalene oxidase or a catalytically active fragment thereof selected from among nucleic acid molecules having a sequence of nucleic acids set forth in SEQ ID NO:3, 68, 69, 70 or 71; a sequence of nucleic acids having at least 98% sequence identity to a sequence of nucleic acids set forth in SEQ ID NO:3, 68, 69, 70 or 71; and degenerates thereof. In a particular example, the isolated nucleic acid molecule has the sequence of nucleotides set forth in SEQ ID NO:3, 68, 69, 70 or 71. In some examples, the isolated nucleic acid molecules encode a cytochrome P450 santalene oxidase polypeptide having a sequence of amino acids set forth in SEQ ID NO:7, 74, 75, 76 or 77. The provided isolated nucleic acid molecules encode cytochrome P450 santalene oxidase polypeptides that catalyze the formation of a santalol, such as an α-santalol, β-santalol or epi-β-santalol, from a santalene, such as an α-santalene, β-santalene or epi-β-santalene, and/or catalyze the hydroxylation or monooxygenation of santalene. In some examples, the encoded cytochrome P450 santalene oxidase polypeptide catalyzes the formation of Z-α-trans-bergamotol from α-trans-bergamotene. Also provided herein are cytochrome P450 santalene oxidase polypeptides encoded by any of the isolated nucleic acid molecules provided herein.
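By way of illustration only, the reference above to degenerates reflects the redundancy of the genetic code: nucleic acid sequences that differ in nucleotide sequence can encode the same sequence of amino acids. The following Python sketch assumes the standard genetic code and uses two short hypothetical coding sequences that are not taken from the Sequence Listing; the function name `translate` is likewise an assumption made for this sketch.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G
# (first base varies slowest, third base fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a DNA coding sequence, codon by codon, in reading frame 1."""
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Hypothetical toy sequences: they differ at two nucleotide positions
# but encode the same three amino acids (Met-Ala-Lys).
seq_a = "ATGGCTAAA"
seq_b = "ATGGCCAAG"
same_protein = translate(seq_a) == translate(seq_b)
```

Here `seq_a` and `seq_b` are degenerate with respect to each other in the sense used above: the nucleotide sequences are not identical, yet the encoded polypeptide is.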

For example, provided herein are isolated Santalum album cytochrome P450 bergamotene oxidases or catalytically active fragments thereof. The provided isolated Santalum album cytochrome P450 bergamotene oxidases or catalytically active fragments thereof catalyze the hydroxylation or monooxygenation of bergamotene and/or catalyze the formation of a bergamotol from a bergamotene. For example, the isolated Santalum album cytochrome P450 bergamotene oxidases catalyze the formation of Z-α-trans-bergamotol or (E)-α-trans-bergamotol from α-trans-bergamotene. In some examples, the isolated Santalum album cytochrome P450 bergamotene oxidases do not catalyze the hydroxylation of a santalene. In other examples, the isolated Santalum album cytochrome P450 bergamotene oxidases catalyze the hydroxylation of a santalene. Also provided herein are isolated Santalum album cytochrome P450 bergamotene oxidases that are members of the CYP76 family.

Provided herein are isolated nucleic acid molecules encoding a Santalum album cytochrome P450 bergamotene oxidase polypeptide or a catalytically active fragment thereof. For example, provided herein are isolated nucleic acid molecules encoding a cytochrome P450 bergamotene oxidase polypeptide having a sequence of amino acids set forth in SEQ ID NO:6, 8, 9 or 73; or a cytochrome P450 bergamotene oxidase polypeptide having a sequence of amino acids that has at least 96% sequence identity to a cytochrome P450 polypeptide set forth in SEQ ID NO:6, 8, 9 or 73. In another example, provided herein are isolated nucleic acid molecules encoding a cytochrome P450 bergamotene oxidase polypeptide having a sequence of amino acids that has at least 50% sequence identity to a cytochrome P450 bergamotene oxidase polypeptide set forth in SEQ ID NO:6, 8, 9 or 73. The cytochrome P450 bergamotene oxidase polypeptide catalyzes the hydroxylation or monooxygenation of bergamotene. For example, the encoded cytochrome P450 bergamotene oxidase polypeptide exhibits at least 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to a sequence of amino acids set forth in SEQ ID NO:6, 8, 9 or 73.

Also provided herein are isolated nucleic acid molecules encoding a cytochrome P450 bergamotene oxidase polypeptide or a catalytically active fragment thereof having a sequence of nucleic acids set forth in any of SEQ ID NOS:2, 4, 5 or 67; a sequence of nucleic acids having at least 98% sequence identity to a sequence of nucleic acids set forth in any of SEQ ID NOS:2, 4, 5 or 67; and degenerates thereof. In a particular example, the isolated nucleic acid molecule has the sequence of nucleic acids set forth in SEQ ID NO:2, 4, 5 or 67. In some examples, the isolated nucleic acid molecule encodes a cytochrome P450 bergamotene oxidase polypeptide having a sequence of amino acids set forth in SEQ ID NO:6, 8, 9 or 73. The provided isolated nucleic acid molecules encode a cytochrome P450 bergamotene oxidase polypeptide that catalyzes the formation of a bergamotol, such as Z-α-trans-bergamotol, from a bergamotene, such as α-trans-bergamotene, and/or catalyzes the hydroxylation or monooxygenation of bergamotene, such as α-trans-bergamotene. In some examples, the encoded cytochrome P450 bergamotene oxidase does not catalyze the hydroxylation of a santalene. Also provided herein are cytochrome P450 bergamotene oxidase polypeptides encoded by any of the isolated nucleic acid molecules provided herein.

Also provided herein are isolated nucleic acid molecules encoding a Santalum album cytochrome P450 polypeptide or catalytically active fragments thereof having a sequence of nucleic acids set forth in SEQ ID NO:1 or 72; a sequence of nucleic acids having at least 99% sequence identity to a sequence of nucleic acids set forth in SEQ ID NO:1 or 72; and degenerates thereof. Also provided herein are isolated nucleic acid molecules encoding a cytochrome P450 polypeptide having a sequence of amino acids set forth in SEQ ID NO:50 or 78; or having a sequence of amino acids having at least 99% sequence identity to the sequence of amino acids set forth in SEQ ID NO:50 or 78. Also provided herein are Santalum album cytochrome P450 polypeptides encoded by any of the isolated nucleic acid molecules provided herein.

Also provided herein are nucleic acid molecules encoding a cytochrome P450 polypeptide or catalytically active fragments thereof having one or more heterologous domains or portions thereof from one or more cytochrome P450s. The domain is selected from among helix A, β strand 1-1, β strand 1-2, helix B, β strand 1-5, helix B′, helix C, helix D, β strand 3-1, helix E, helix F, helix G, helix H, β strand 5-1, β strand 5-2, helix I, helix J, helix J′, helix K, β strand 1-4, β strand 2-1, β strand 2-2, β strand 1-3, Heme domain, helix L, β strand 3-3, β strand 4-1, β strand 4-2 and β strand 3-2. In some examples, the heterologous domain or a contiguous portion thereof replaces all or a contiguous portion of the corresponding native domain of the cytochrome P450 polypeptide not containing the heterologous domain. For example, the encoded modified cytochrome P450 polypeptide contains all of a heterologous domain of a different cytochrome P450. In other examples, the encoded modified cytochrome P450 polypeptide has at least 50%, 60%, 70%, 80%, 90%, or 95% of contiguous amino acids of a heterologous domain from one or more different cytochrome P450s.

Provided herein are isolated Santalum album cytochrome P450 reductases or catalytically active fragments thereof. For example, provided herein are isolated Santalum album cytochrome P450 reductases that catalyze the transfer of two electrons from NADPH to an electron acceptor that is a cytochrome P450, heme oxygenase, cytochrome b₅ or squalene epoxidase. In particular examples, the electron acceptor is a cytochrome P450.

Also provided herein are isolated nucleic acid molecules encoding a Santalum album cytochrome P450 reductase polypeptide or catalytically active fragments thereof. For example, provided herein are isolated nucleic acid molecules encoding a cytochrome P450 reductase polypeptide having a sequence of amino acids set forth in SEQ ID NO:12 or 13; or encoding a cytochrome P450 reductase polypeptide having a sequence of amino acids that has at least 80% sequence identity to a cytochrome P450 reductase polypeptide set forth in SEQ ID NO:12 or 13. In another example, provided herein is an isolated nucleic acid molecule encoding a cytochrome P450 reductase polypeptide that exhibits at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to a sequence of amino acids set forth in SEQ ID NO:12 or 13.

Also provided herein are isolated nucleic acid molecules having a sequence of nucleic acids set forth in SEQ ID NO:10 or 11; a sequence of nucleic acids having at least 95% sequence identity to a sequence of nucleic acids set forth in SEQ ID NO:10 or 11; and degenerates thereof. For example, provided herein are isolated nucleic acid molecules having a sequence of nucleic acids set forth in SEQ ID NO:10 or 11. In some examples, the isolated nucleic acid molecules encode cytochrome P450 reductase polypeptides having a sequence of amino acids that has at least 95% sequence identity to a cytochrome P450 reductase polypeptide set forth in SEQ ID NO:12 or 13. In a particular example, the isolated nucleic acid molecule encodes a cytochrome P450 reductase polypeptide having a sequence of amino acids set forth in SEQ ID NO:12 or 13. The provided nucleic acid molecules encode cytochrome P450 reductase polypeptides that catalyze the transfer of two electrons from NADPH to an electron acceptor, such as a cytochrome P450, heme oxygenase, cytochrome b₅ or squalene epoxidase. In a particular example, the electron acceptor is a cytochrome P450. Also provided herein are cytochrome P450 reductase polypeptides encoded by the nucleic acid molecules.

Also provided herein are nucleic acid molecules encoding a modified Santalum album cytochrome P450 reductase polypeptide or catalytically active fragments thereof. For example, provided herein are nucleic acid molecules encoding modified cytochrome P450 reductase polypeptides that contain at least one amino acid replacement, addition or deletion compared to the cytochrome P450 reductase polypeptide not containing the modification. In some examples, the encoded modified cytochrome P450 reductase polypeptide is N- or C-terminally truncated. For example, provided herein are nucleic acid molecules encoding a modified cytochrome P450 reductase polypeptide that is N-terminally truncated. For example, the nucleic acid molecule encodes a modified cytochrome P450 reductase polypeptide that has a sequence of amino acids set forth in SEQ ID NO:14 or 15; or has a sequence of amino acids that is at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical to SEQ ID NO:14 or 15. Also provided herein are nucleic acid molecules having a sequence of nucleic acids set forth in SEQ ID NO:63 or 64; a sequence of nucleic acids having at least 95% sequence identity to a sequence of nucleic acids set forth in SEQ ID NO:63 or 64; and degenerates thereof. The provided nucleic acid molecules encode cytochrome P450 reductase polypeptides that catalyze the transfer of two electrons from NADPH to an electron acceptor, such as a cytochrome P450, heme oxygenase, cytochrome b₅ or squalene epoxidase. In a particular example, the electron acceptor is a cytochrome P450. Also provided herein are cytochrome P450 reductase polypeptides encoded by the nucleic acid molecules.

Provided herein are nucleic acid molecules encoding a fusion protein containing a Santalum album santalene synthase or a catalytically active fragment thereof and/or a cytochrome P450 santalene oxidase or bergamotene oxidase or a catalytically active fragment thereof and/or a cytochrome P450 reductase or a catalytically active fragment thereof.

Provided herein are nucleic acid molecules encoding a fusion protein containing santalene synthase and a cytochrome P450 santalene oxidase or a catalytically active fragment thereof. The full-length santalene synthase is encoded by a sequence of nucleotides set forth in any of SEQ ID NOS:58-60 and the cytochrome P450 santalene oxidase is encoded by any nucleic acid molecule provided herein that encodes a cytochrome P450 santalene oxidase. In another example, provided herein are nucleic acid molecules encoding a santalene synthase and a cytochrome P450 santalene oxidase. The santalene synthase has a sequence of amino acids set forth in any of SEQ ID NOS:17, 52 and 53, and the cytochrome P450 santalene oxidase has a sequence of amino acids set forth in any of SEQ ID NOS:7, 73, 74, 75 and 76.

Provided herein are nucleic acid molecules encoding a fusion protein containing santalene synthase and a cytochrome P450 bergamotene oxidase or a catalytically active fragment thereof. The santalene synthase is encoded by a sequence of nucleotides set forth in any of SEQ ID NOS:58-60 and the cytochrome P450 bergamotene oxidase is encoded by any nucleic acid molecule provided herein that encodes a cytochrome P450 bergamotene oxidase. In another example, provided herein are nucleic acid molecules encoding a santalene synthase and a cytochrome P450 bergamotene oxidase. The santalene synthase has a sequence of amino acids set forth in any of SEQ ID NOS:17, 52 and 53 and the cytochrome P450 bergamotene oxidase has a sequence of amino acids set forth in any of SEQ ID NOS:6, 8, 9 and 73.

Provided herein are nucleic acid molecules encoding a fusion protein containing a cytochrome P450 or a catalytically active fragment thereof and a cytochrome P450 reductase or a catalytically active fragment thereof, where the cytochrome P450 is encoded by any nucleic acid molecule provided herein that encodes a cytochrome P450 oxidase and the cytochrome P450 reductase is encoded by any nucleic acid molecule provided herein that encodes a cytochrome P450 reductase. For example, provided herein are nucleic acid molecules encoding a cytochrome P450 that has a sequence of amino acids set forth in any of SEQ ID NOS:6-9 and 73-78 and a cytochrome P450 reductase that has a sequence of amino acids set forth in any of SEQ ID NOS:12-15.

In some examples, in the nucleic acid molecules provided herein encoding a fusion protein, the santalene synthase and/or cytochrome P450 santalene oxidase or bergamotene oxidase and/or cytochrome P450 reductase are linked directly. In other examples, in the nucleic acid molecules provided herein encoding a fusion protein, the santalene synthase and/or cytochrome P450 santalene oxidase or bergamotene oxidase and/or cytochrome P450 reductase are linked via a linker.

Also provided herein are vectors containing any nucleic acid molecule provided herein, including nucleic acid molecules encoding cytochrome P450s, such as santalene oxidases and bergamotene oxidases, cytochrome P450 reductases, modified cytochrome P450 reductases and fusion proteins. In some examples, the vector is a prokaryotic vector, a viral vector, or a eukaryotic vector. For example, the vector is a yeast vector. Also provided herein are cells containing any vector provided herein. Also provided herein are cells containing any nucleic acid molecule provided herein, including nucleic acid molecules encoding cytochrome P450s, such as santalene oxidases and bergamotene oxidases, cytochrome P450 reductases, modified cytochrome P450 reductases and fusion proteins. In some examples, the cell is a prokaryotic cell or a eukaryotic cell. In other examples, the cell is selected from among a bacterial, yeast, insect, plant or mammalian cell. In an example, the cell is a yeast cell. Included among yeast cells are a Saccharomyces genus cell and a Pichia genus cell. For example, the cell is a Saccharomyces cerevisiae cell. In another example, the cell is an Escherichia coli cell. Thus, provided are recombinant cells, including yeast cells, for production of santalols and bergamotols.

The cells can include nucleic acid encoding a synthase, such as santalene synthase, such as a Santalum album synthase, to catalyze production of a substrate for the P450 enzymes provided herein.

Also provided herein are cells that express a cytochrome P450 santalene oxidase polypeptide, a cytochrome P450 bergamotene oxidase polypeptide, a cytochrome P450 reductase polypeptide and/or a fusion protein containing a Santalum album santalene synthase and/or a cytochrome P450 santalene oxidase or bergamotene oxidase and/or a cytochrome P450 reductase. Also provided herein are transgenic plants containing any vector provided herein. In some examples, the transgenic plant is a tobacco plant.

Provided herein are methods for producing a cytochrome P450 polypeptide, by: introducing a nucleic acid molecule provided herein that encodes a cytochrome P450 polypeptide or any vector provided herein that encodes a cytochrome P450 polypeptide into a cell; culturing the cell under conditions suitable for expression of the cytochrome P450 polypeptide encoded by the nucleic acid or vector; and, optionally isolating the cytochrome P450 polypeptide.

Provided herein are methods for producing a cytochrome P450 reductase polypeptide, by: introducing a nucleic acid molecule provided herein that encodes a cytochrome P450 reductase polypeptide or any vector provided herein that encodes a cytochrome P450 reductase polypeptide into a cell; culturing the cell under conditions suitable for expression of the cytochrome P450 reductase polypeptide encoded by the nucleic acid or vector; and, optionally isolating the cytochrome P450 reductase polypeptide.

Provided herein are methods for production of a santalol, bergamotol and/or mixtures thereof, by: (a) contacting a santalene and/or bergamotene with a cytochrome P450 santalene oxidase or bergamotene oxidase under conditions suitable for the formation of a santalol, bergamotol and/or mixtures thereof; and (b) optionally isolating the santalol, bergamotol and/or mixtures thereof. In some examples, step (a) is effected in vitro or in vivo. For example, step (a) is effected in vivo in a cell transformed with a nucleic acid molecule or vector encoding a cytochrome P450 santalene oxidase or bergamotene oxidase polypeptide, whereby the cytochrome P450 santalene oxidase or bergamotene oxidase polypeptide encoded by the nucleic acid molecule or vector is expressed; and the cytochrome P450 santalene oxidase or bergamotene oxidase polypeptide catalyzes the formation of santalol and/or bergamotol from santalene and/or bergamotene.

Provided herein is a host cell containing a nucleic acid molecule encoding a cytochrome P450 or cytochrome P450 polypeptide provided herein. The nucleic acid molecule and cytochrome P450 polypeptide are heterologous to the cell. In some examples, the host cell further contains nucleic acid encoding a synthase that produces a terpene substrate of a cytochrome P450. In some examples, the synthase is heterologous to the host cell. In particular examples, the terpene synthase is a santalene synthase, such as a terpene synthase that catalyzes the formation of santalene and/or bergamotene. For example, the terpene synthase has a sequence of amino acids set forth in any of SEQ ID NOS:17, 52 and 53 or a sequence of amino acids that is at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical to any of SEQ ID NOS:17, 52 and 53. In some examples, the host cell is a prokaryotic cell or a eukaryotic cell that is selected from among a bacterial, yeast, insect, plant or mammalian cell. In a particular example, the host cell is a yeast cell that is a Saccharomyces genus cell or a Pichia genus cell. For example, the host cell is a Saccharomyces cerevisiae cell. In other examples, the host cell is an Escherichia coli cell. In some examples, the host cell produces an acyclic pyrophosphate terpene precursor, such as farnesyl diphosphate. In particular examples, the host cell produces farnesyl diphosphate natively or is modified to produce more farnesyl diphosphate compared to an unmodified cell. Also provided herein is a method for production of a santalol, bergamotol and/or mixtures thereof, said method including the steps of culturing any of the host cells provided herein under conditions suitable for the formation of a santalol, bergamotol and/or mixtures thereof; and optionally isolating the santalol, bergamotol and/or mixtures thereof.

Provided herein are methods for production of a santalol, bergamotol and/or mixtures thereof, by: (a) contacting an acyclic pyrophosphate terpene precursor with a santalene synthase under conditions suitable for the formation of a santalene and/or bergamotene; (b) contacting the resulting santalene and/or bergamotene with a cytochrome P450 santalene oxidase or bergamotene oxidase under conditions suitable for the formation of a santalol, bergamotol and/or mixture thereof to produce a santalol, bergamotol or mixture thereof; and (c) optionally isolating the santalene and bergamotene produced in step (a) or the santalol, bergamotol, and/or mixtures thereof produced in step (b). In some examples, step (a) and/or step (b) is/are performed in vitro or in vivo. For example, step (a) is performed in vivo in a cell transformed with a nucleic acid molecule encoding a santalene synthase, whereby the santalene synthase encoded by the nucleic acid molecule is expressed; and the santalene synthase catalyzes the formation of santalene and bergamotene from the acyclic pyrophosphate terpene precursor; and/or step (b) is effected in vivo in a cell transformed with a nucleic acid molecule or vector encoding a cytochrome P450 santalene oxidase or bergamotene oxidase polypeptide, whereby the cytochrome P450 santalene oxidase or bergamotene oxidase polypeptide encoded by the nucleic acid molecule or vector is expressed; and the cytochrome P450 santalene oxidase or bergamotene oxidase polypeptide catalyzes the formation of santalol and/or bergamotol from santalene and/or bergamotene. In such examples, the acyclic pyrophosphate terpene precursor can be a farnesyl pyrophosphate.

In any of the methods provided herein, the cell can be a prokaryotic cell or a eukaryotic cell that is selected from among a bacterial, yeast, insect, plant or mammalian cell. In some examples, the cell is a yeast cell that is a Saccharomyces genus cell or a Pichia genus cell, such as a Saccharomyces cerevisiae cell. In some examples, the cell is modified to produce more FPP compared to an unmodified cell. In some examples of the methods, the cell is modified to produce a santalene synthase. For example, the cell is modified to produce a santalene synthase that has a sequence of amino acids set forth in SEQ ID NO:17, 52 or 53 or a synthase having at least 80%, 85%, 90% or 95% sequence identity therewith.

In some examples of the methods provided herein, the santalene or bergamotene is an α-santalene, β-santalene, epi-β-santalene or α-trans-bergamotene. In some examples, the santalol or bergamotol is an α-santalol, β-santalol, epi-β-santalol or α-trans-bergamotol. In some examples, the santalol or bergamotol is an (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol, (Z)-α-trans-bergamotol or (E)-α-trans-bergamotol. In further examples of the provided methods, the santalene, bergamotene, santalol, bergamotol or mixtures thereof are isolated by extraction with an organic solvent and/or column chromatography.

In some examples of the provided methods, santalene and/or bergamotene is contacted with a cytochrome P450 santalene oxidase that is: a cytochrome P450 santalene oxidase polypeptide provided herein; a cytochrome P450 santalene oxidase polypeptide provided herein encoded by any nucleic acid molecule provided herein; a nucleic acid molecule provided herein that encodes a cytochrome P450 santalene oxidase; or a vector provided herein that encodes a cytochrome P450 santalene oxidase, whereby santalol and/or bergamotol are produced.

In some examples of the provided methods, bergamotene is contacted with a cytochrome P450 bergamotene oxidase that is: a cytochrome P450 bergamotene oxidase polypeptide provided herein; a cytochrome P450 bergamotene oxidase polypeptide provided herein encoded by any nucleic acid molecule provided herein; a nucleic acid molecule provided herein that encodes a cytochrome P450 bergamotene oxidase; or a vector provided herein that encodes a cytochrome P450 bergamotene oxidase, whereby bergamotol is produced.

Also provided herein are methods for production of a santalol, bergamotol and/or mixtures thereof. Each of steps (a) and (b) can be effected simultaneously or sequentially. In one example, steps (a) and (b) are effected simultaneously with a nucleic acid molecule encoding a fusion polypeptide containing a santalene synthase and a cytochrome P450 santalene oxidase or bergamotene oxidase; or a fusion polypeptide containing a santalene synthase and a cytochrome P450 santalene oxidase or bergamotene oxidase. In particular examples, santalene and/or bergamotene is contacted with a nucleic acid molecule provided herein that encodes a fusion polypeptide; or a fusion polypeptide encoded by a nucleic acid molecule provided herein.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 depicts the chemical structures of (Z)-α-santalol (1), (E)-α-santalol (9), (Z)-β-santalol (2), (E)-β-santalol (10), (E)-epi-β-santalol (3), (Z)-epi-β-santalol (11), (Z)-α-trans-bergamotol (4), (E)-α-trans-bergamotol (12), α-santalene (5), β-santalene (6), epi-β-santalene (7) and α-trans-bergamotene (8).

FIGS. 2A-2B depict the alignment of the santalene oxidase set forth in SEQ ID NO:7 with the bergamotene oxidases set forth in SEQ ID NOS:6, 8 and 9. A “*” means that the aligned residues are identical, a “:” means that aligned residues are not identical, but are similar and contain conservative amino acid residues at the aligned position, and a “.” means that the aligned residues are similar and contain semi-conservative amino acid residues at the aligned position.

FIGS. 3A-3C depict the alignment of Santalum album cytochrome P450 reductases set forth in SEQ ID NOS:12 and 13 with Arabidopsis thaliana cytochrome P450 reductases set forth in SEQ ID NOS:46 and 58. A “*” means that the aligned residues are identical, a “:” means that aligned residues are not identical, but are similar and contain conservative amino acid residues at the aligned position, and a “.” means that the aligned residues are similar and contain semi-conservative amino acid residues at the aligned position.

FIG. 4 depicts the neighbor-joining phylogeny of the predicted protein sequences of SaCYP76F38v1 (SaCYP76-G5), SaCYP76F39v1 (SaCYP76-G10), SaCYP76F37v1 (SaCYP76-G11) and SaCYP76F38v2 (SaCYP76-G12) and cytochrome P450 enzymes for terpenoid metabolism, as described in Example 4.

FIGS. 5A-5B depict the alignment of the santalene oxidase set forth in SEQ ID NO:7 and the bergamotene oxidase set forth in SEQ ID NO:6 with cytochrome P450BM-3 set forth in SEQ ID NO:66. A “*” means that the aligned residues are identical, a “:” means that aligned residues are not identical, but are similar and contain conservative amino acid residues at the aligned position, and a “.” means that the aligned residues are similar and contain semi-conservative amino acid residues at the aligned position.

FIGS. 6A-6D depict the GC-MS chromatogram of products extracted from in vivo assays with SaCYP76F38v1 (SaCYP76-G5) (FIG. 6A), SaCYP76F37v1 (SaCYP76-G11) (FIG. 6B), SaCYP76F38v2 (SaCYP76-G12) (FIG. 6C) and empty vector (FIG. 6D) as described in Example 10. The peaks are identified in Table 13.

FIG. 7 depicts the total ion chromatogram of S. album oil extract. The peaks are identified in Table 13.

FIGS. 8A-8C depict the GC-MS chromatogram of S. album native oil (FIG. 8A) and of products extracted from in vivo assays with SaCYP76F39v1 (SaCYP76-G10) (FIG. 8B) and empty vector (FIG. 8C) as described in Example 10. The peaks are identified in Table 11.

FIGS. 9A-9B depict the GC-MS chromatogram of S. album native oil (FIG. 9A) and of products extracted from in vitro assays with SaCYP76F39v1 (SaCYP76-G10) (FIG. 9B) as described in Example 11. The peaks are identified in Table 11.

FIG. 10 depicts the neighbor-joining phylogeny of the protein sequences of the S. album CYP76Fs and related terpene-modifying cytochrome P450s, as described in Example 4. The highlighted CYP76Fs indicate those in clade I (marked with I) and clade II (marked with II).

FIGS. 11A-11B depict the GC-MS analysis (extracted ion chromatograms) of products formed in vivo in yeast cells expressing SaSSy, SaCPR2 and SaCYP76F39v1 (SaCYP76-G10) (FIG. 11A) and empty vector (FIG. 11B). The peaks are identified in Table 12. Peaks marked with the symbol (*) correspond to farnesol, which is also produced in yeast cells without SaCYP76F. Peaks in FIG. 11A marked with the symbol (#) represent yeast in vivo modifications of santalols (see FIGS. 12A and 12B).

FIGS. 12A-12B depict the GC-MS analysis (extracted ion chromatograms) of sesquiterpenols of a natural sandalwood oil sample before (FIG. 12A) and after (FIG. 12B) overnight incubation with yeast cells that do not contain a SaCYP76F gene. Peaks in FIG. 12B marked with the symbol (#) represent yeast in vivo modifications of santalols independent of SaCYP76F. The peaks are identified in Table 12.

FIGS. 13A-13D depict the GC-MS analysis (extracted ion chromatograms) of compounds formed in vivo in yeast cells expressing SaSSy, SaCPR2 and SaCYP76F39v2 (SaCYP76-G15) (FIG. 13A), SaCYP76F40 (SaCYP76-G16) (FIG. 13B), SaCYP76F41 (SaCYP76-G17) (FIG. 13C), or SaCYP76F42 (SaCYP76-G13) (FIG. 13D). The peaks are identified in Table 12. Peaks marked with the symbol (*) correspond to farnesol, which is produced in yeast cells without SaCYP76F. Peaks marked with the symbol (#) represent yeast in vivo modifications of santalols independent of SaCYP76F.

FIGS. 14A-14E depict the GC-MS analysis (extracted ion chromatograms) of compounds formed in vivo in yeast cells expressing SaSSy, SaCPR2 and SaCYP76F38v1 (SaCYP76-G5) (FIG. 14A), SaCYP76F38v2 (SaCYP76-G12) (FIG. 14B), SaCYP76F37v1 (SaCYP76-G11) (FIG. 14C), SaCYP76F37v2 (SaCYP76-G14) (FIG. 14D), or SaCYP76F43 (SaCYP76-G18) (FIG. 14E). The peaks are identified in Table 12. Peaks marked with the symbol (*) correspond to farnesol, which is produced in yeast cells without SaCYP76F. Peaks marked with the symbol (#) represent yeast in vivo modifications of santalols independent of SaCYP76F.

FIG. 15A depicts the GC-MS analysis (extracted ion chromatograms) of products formed in vitro with SaCYP76F39v1 (SaCYP76-G10) and a sesquiterpene mixture of α-, β- and epi-β-santalene and α-trans-bergamotene. FIG. 15B depicts the GC-MS analysis (extracted ion chromatograms) of authentic S. album oil. FIG. 15C depicts the GC-MS analysis (extracted ion chromatograms) from control assays performed with microsomes isolated from yeast cells transformed with an empty vector. The peaks are identified in Table 12.

FIGS. 16A-16E depict the GC-MS analysis (extracted ion chromatograms) of products formed in vitro with a sesquiterpene mixture of α-, β- and epi-β-santalene and α-trans-bergamotene as the substrate and clade I SaCYP76F cDNAs SaCYP76F39v2 (SaCYP76-G15) (FIG. 16A); SaCYP76F40 (SaCYP76-G16) (FIG. 16B); SaCYP76F41 (SaCYP76-G17) (FIG. 16C); SaCYP76F42 (SaCYP76-G13) (FIG. 16D); or empty vector as control (FIG. 16E). The peaks are identified in Table 12.

FIGS. 17A-17E depict the GC-MS analysis (extracted ion chromatograms) of products formed in vitro with a sesquiterpene mixture of α-, β- and epi-β-santalene and α-trans-bergamotene as the substrate and clade II SaCYP76F cDNAs SaCYP76F38v1 (SaCYP76-G5) (FIG. 17A); SaCYP76F38v2 (SaCYP76-G12) (FIG. 17B); SaCYP76F37v1 (SaCYP76-G11) (FIG. 17C); SaCYP76F37v2 (SaCYP76-G14) (FIG. 17D); or empty vector as control (FIG. 17E). The peaks are identified in Table 12.

FIG. 18 depicts the reduced CO-difference spectra of isolated microsomes containing S. album CYP76F proteins. CO-difference spectra of microsomal fractions from S. cerevisiae harboring a cytochrome P450 or an empty vector are shown. Concentrations of SaCYP76F proteins are given based on an extinction coefficient of 91,000 M⁻¹cm⁻¹.
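
The concentration estimate underlying FIG. 18 can be sketched as follows. This is a minimal illustration only, assuming the conventional reduced CO-difference calculation in which the absorbance difference between 450 nm and 490 nm is divided by the extinction coefficient and the cuvette path length (Beer-Lambert law). The function name, the 1 cm default path length and the sample readings are hypothetical and are not taken from the Examples.

```python
EXTINCTION_COEFFICIENT = 91_000.0  # M^-1 cm^-1, the value stated for FIG. 18


def p450_concentration_molar(delta_a450, delta_a490, path_length_cm=1.0):
    """Estimate the P450 concentration (mol/L) from a reduced CO-difference spectrum.

    Assumption: Beer-Lambert law applied to the difference (A450 - A490)
    of the CO-difference spectrum; the wavelengths are not stated in the text.
    """
    if path_length_cm <= 0:
        raise ValueError("path length must be positive")
    return (delta_a450 - delta_a490) / (EXTINCTION_COEFFICIENT * path_length_cm)


# Hypothetical absorbance readings, for illustration only
concentration = p450_concentration_molar(0.0455, 0.0)
print(f"{concentration * 1e6:.2f} micromolar")
```

Any dilution of the microsomal preparation before measurement would have to be corrected for separately.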

FIGS. 19A-19D depict the GC-MS analysis (extracted ion chromatograms) of a sesquiterpene mixture produced with a recombinant yeast strain expressing SaSSy (FIG. 19A) and fractions separated by TLC (FIGS. 19B-19D). The sesquiterpene mixture and fractions were prepared as described in Example 9. The peaks correspond to: α-santalene, peak 1; α-exo-bergamotene, peak 2; epi-β-santalene, peak 3; and β-santalene, peak 4.

FIGS. 20A-20G depict the GC-MS analysis (extracted ion chromatograms) of products formed in vitro with SaCYP76F39v1 (SaCYP76-G10) or SaCYP76F37v1 (SaCYP76-G11) using partially purified substrates. FIGS. 20A-20C depict product profiles in assays with SaCYP76F39v1 (SaCYP76-G10) using α-santalene (FIG. 20A), α-exo-bergamotene (FIG. 20B), or epi-β-santalene and β-santalene (FIG. 20C) as the substrates. FIGS. 20D-20F depict product profiles in assays with SaCYP76F37v1 (SaCYP76-G11) using α-santalene (FIG. 20D), α-exo-bergamotene (FIG. 20E), or epi-β-santalene and β-santalene (FIG. 20F) as the substrates. FIG. 20G depicts the extracted ion chromatogram for authentic Santalum album oil. The peaks are identified in Table 12.

FIGS. 21A-21C depict the alignment of the S. album cytochrome P450s set forth in SEQ ID NOS:6-9 and 73-78. Horizontal arrows indicate the proline region (a), oxygen binding motif (b) and heme binding motif (c). Boxes indicate the substrate recognition site (SRS) regions originally described by Gotoh (1992) J Biol Chem 267:83-90. A “*” means that the aligned residues are identical, a “:” means that aligned residues are not identical, but are similar and contain conservative amino acid residues at the aligned position, and a “.” means that the aligned residues are similar and contain semi-conservative amino acid residues at the aligned position.

DETAILED DESCRIPTION

Outline

-   A. Definitions
-   B. Overview
    -   1. Biosynthesis of Terpenoids
        -   a. Santalols
        -   b. Bergamotols
    -   2. Cytochrome P450 Enzymes
        -   a. Structure
        -   b. Activity
    -   3. Cytochrome P450 Reductases
        -   a. Structure
        -   b. Activity
-   C. Cytochrome P450 polypeptides and encoding nucleic acid molecules
    -   1. Cytochrome P450 santalene oxidase polypeptides
        -   Modified cytochrome P450 santalene oxidase polypeptides
    -   2. Cytochrome P450 bergamotene oxidase polypeptides
        -   Modified cytochrome P450 bergamotene oxidase polypeptides
    -   3. Additional modifications
        -   a. Truncated polypeptides
        -   b. Polypeptides with altered activities or properties
        -   c. Domain swaps
        -   d. Fusion proteins
-   D. Cytochrome P450 reductase polypeptides and encoding nucleic acid molecules
    -   1. Cytochrome P450 reductase polypeptides
    -   2. Modified cytochrome P450 reductase polypeptides
    -   3. Additional modifications
        -   a. Truncated polypeptides
        -   b. Polypeptides with altered activities or properties
        -   c. Domain swaps
        -   d. Fusion proteins
-   E. Methods for producing modified cytochrome P450 and cytochrome P450 reductase polypeptides and encoding nucleic acid molecules
-   F. Expression of cytochrome P450 and cytochrome P450 reductase polypeptides and encoding nucleic acid molecules
    -   1. Isolation of nucleic acid encoding Santalum album cytochrome P450 and cytochrome P450 reductase polypeptides
    -   2. Generation of modified nucleic acids
    -   3. Vectors and Cells
    -   4. Expression systems
        -   a. Prokaryotic cells
        -   b. Yeast cells
        -   c. Plants and plant cells
        -   d. Insects and insect cells
        -   e. Mammalian cells
        -   f. Exemplary host cells
    -   5. Purification
    -   6. Fusion proteins
-   G. Methods for producing terpenoids and methods for detecting such products and the activity of the cytochrome P450 and cytochrome P450 reductase polypeptides
    -   1. Synthesis of Santalols and Bergamotols
        -   a. Oxidation of Santalenes and Bergamotenes
        -   b. Conversion of acyclic pyrophosphate terpene precursors
    -   2. Methods for production
        -   a. Exemplary cells
        -   b. Culture of cells
        -   c. Isolation and assays for detection and identification
    -   3. Production of sandalwood oil
    -   4. Assays for detecting enzymatic activity of cytochrome P450 and cytochrome P450 reductase polypeptides
        -   a. Methods for determining the activity of cytochrome P450 polypeptides
        -   b. Methods for determining the activity of cytochrome P450 reductase polypeptides
-   H. Examples

A. DEFINITIONS

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of skill in the art to which the invention(s) belong. All patents, patent applications, published applications and publications, Genbank sequences, databases, websites and other published materials referred to throughout the entire disclosure herein, unless noted otherwise, are incorporated by reference in their entirety. In the event that there are a plurality of definitions for terms herein, those in this section prevail. Where reference is made to a URL or other such identifier or address, it is understood that such identifiers can change and particular information on the internet can come and go, but equivalent information can be found by searching the internet. Reference thereto evidences the availability and public dissemination of such information.

As used herein, an acyclic pyrophosphate terpene precursor is any acyclic pyrophosphate compound that is a precursor to the production of at least one terpene, including, but not limited to, farnesyl-pyrophosphate (FPP), geranyl-pyrophosphate (GPP) and geranylgeranyl-pyrophosphate (GGPP). Acyclic pyrophosphate terpene precursors are thus substrates for terpene synthases.

As used herein, a terpene is an unsaturated hydrocarbon based on the isoprene unit (C₅H₈), and having a general formula C_(5x)H_(8x), such as C₁₀H₁₆. Reference to a terpene includes acyclic, monocyclic and polycyclic terpenes. Terpenes include, but are not limited to, monoterpenes, which contain 10 carbon atoms; sesquiterpenes, which contain 15 carbon atoms; diterpenes, which contain 20 carbon atoms; and triterpenes, which contain 30 carbon atoms. Reference to a terpene also includes stereoisomers of the terpene.
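
The general formula C_(5x)H_(8x) can be illustrated with a short sketch that derives the molecular formula for each terpene class named above from its number of isoprene units. The function and dictionary names are hypothetical and are used only for illustration.

```python
# Number of isoprene (C5H8) units per terpene class, from the carbon counts above
ISOPRENE_UNITS = {
    "monoterpene": 2,    # 10 carbon atoms
    "sesquiterpene": 3,  # 15 carbon atoms
    "diterpene": 4,      # 20 carbon atoms
    "triterpene": 6,     # 30 carbon atoms
}


def terpene_formula(isoprene_units):
    """Return the molecular formula C(5x)H(8x) for x isoprene units."""
    if isoprene_units < 1:
        raise ValueError("a terpene contains at least one isoprene unit")
    return f"C{5 * isoprene_units}H{8 * isoprene_units}"


for name, units in ISOPRENE_UNITS.items():
    print(name, terpene_formula(units))
```

On this basis the santalenes and bergamotenes, which are sesquiterpenes, correspond to x = 3.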

As used herein, a terpenoid is a chemically modified terpene. In one example, a terpenoid is a terpene that has been chemically modified by addition of a hydroxyl group, such as a santalol or bergamotol. Reference to a terpenoid includes acyclic, monocyclic and polycyclic terpenoids, including monoterpenoids, sesquiterpenoids and diterpenoids. Reference to a terpenoid also includes stereoisomers of the terpenoid.

As used herein, a terpene synthase is a polypeptide capable of catalyzing the formation of one or more terpenes from a pyrophosphate terpene precursor. In some examples, a terpene synthase, including, but not limited to, santalene synthase, catalyzes the formation of one or more terpenes from an acyclic pyrophosphate terpene precursor, for example, FPP, GPP or GGPP.

As used herein, “cytochrome P450,” “cytochrome P450 oxidase,” “cytochrome P450 polypeptide,” “cytochrome P450 oxidase polypeptide” or “CYP” is a polypeptide capable of catalyzing the monooxygenation of any terpene precursor, including monoterpenes, sesquiterpenes and diterpenes. A cytochrome P450 can catalyze the monooxygenation of a terpene or a mixture of terpenes, resulting in the production of one or more terpenoids.

For purposes herein, cytochrome P450 oxidases provided herein are enzymes with cytochrome P450 oxidase activity and have greater than or greater than about 50%, 55%, 60%, 65%, 70%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity, when aligned with the cytochrome P450 oxidase sequence set forth in SEQ ID NO:50. Reference to a cytochrome P450 oxidase includes any cytochrome P450 oxidase polypeptide including, but not limited to, a recombinantly produced polypeptide, synthetically produced polypeptide and a cytochrome P450 oxidase polypeptide extracted or isolated from cells or plant matter, including, but not limited to, heartwood of a sandalwood tree. Exemplary cytochrome P450 oxidase polypeptides include those isolated from Santalum album. Reference to a cytochrome P450 oxidase includes cytochrome P450 oxidase from any genus or species, and includes allelic or species variants, variants encoded by splice variants, and other variants thereof, including polypeptides that have at least or at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the cytochrome P450 oxidase set forth in SEQ ID NO:50 when aligned therewith. Cytochrome P450 oxidase also includes catalytically active fragments thereof that retain cytochrome P450 oxidase activity.
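
A sequence identity threshold of this kind can be checked with a sketch such as the following. It is illustrative only: it assumes the two sequences have already been aligned (with "-" marking gaps) and it counts a column with a gap in one sequence as a mismatch, which is only one of several conventions in use. The function names and the toy sequences are hypothetical and are not sequences from the Sequence Listing.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: columns that are gaps in both sequences are ignored, and a
    column with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no residue in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold_percent):
    """True if the identity is at least the stated threshold (e.g. 50 or 95)."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Toy aligned fragments, for illustration only
print(round(percent_identity("MKT-LV", "MKTALI"), 1))
print(meets_identity_threshold("MKT-LV", "MKTALI", 50))
```

In practice the alignment itself would be produced by an alignment program, and the identity convention of that program would govern the value reported.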

As used herein, “cytochrome P450 santalene oxidase” or “cytochrome P450 santalene oxidase polypeptide” is a polypeptide capable of catalyzing the formation of a santalol from a santalene, for example, capable of catalyzing the monooxygenation or hydroxylation of a santalene. A cytochrome P450 santalene oxidase polypeptide can produce one or a mixture of santalols from one or a mixture of santalenes. A cytochrome P450 santalene oxidase polypeptide is also capable of catalyzing the formation of a bergamotol from a bergamotene. For example, a cytochrome P450 santalene oxidase catalyzes the formation of α-santalol from α-santalene, β-santalol from β-santalene, epi-β-santalol from epi-β-santalene and/or Z-α-trans-bergamotol or E-α-trans-bergamotol from α-trans-bergamotene.

For purposes herein, cytochrome P450 santalene oxidases provided herein are enzymes with cytochrome P450 santalene oxidase activity and have greater than or greater than about 50%, 55%, 60%, 65%, 70%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity, when aligned with the cytochrome P450 santalene oxidase sequence set forth in SEQ ID NO:7, 74, 75, 76 or 77. Reference to a cytochrome P450 santalene oxidase includes any cytochrome P450 santalene oxidase polypeptide including, but not limited to, a recombinantly produced polypeptide, synthetically produced polypeptide and a cytochrome P450 santalene oxidase polypeptide extracted or isolated from cells or plant matter, including, but not limited to, heartwood of a sandalwood tree. Exemplary cytochrome P450 santalene oxidase polypeptides include those isolated from Santalum album. Reference to a cytochrome P450 santalene oxidase includes cytochrome P450 santalene oxidase from any genus or species, and includes allelic or species variants, variants encoded by splice variants, and other variants thereof, including polypeptides that have at least or at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the cytochrome P450 santalene oxidase set forth in SEQ ID NO:7, 74, 75, 76 or 77 when aligned therewith. Cytochrome P450 santalene oxidase also includes catalytically active fragments thereof that retain cytochrome P450 santalene oxidase activity.

As used herein, “cytochrome P450 santalene oxidase activity” or “santalene oxidase activity” refers to the ability to catalyze the formation of one or more santalols from one or more santalenes. That is, cytochrome P450 santalene oxidases catalyze the monooxygenation or hydroxylation of santalenes. Cytochrome P450 santalene oxidases also catalyze the hydroxylation of bergamotene. For example, cytochrome P450 santalene oxidases catalyze the formation of α-santalol from α-santalene, β-santalol from β-santalene, epi-β-santalol from epi-β-santalene and/or Z-α-trans-bergamotol from α-trans-bergamotene. Methods to assess santalol or bergamotol formation from a reaction of a santalene or bergamotene are well known in the art and described herein. The production of a santalol or bergamotol can be assessed by methods such as, for example, gas chromatography-mass spectrometry (GC-MS) (see Examples below). A cytochrome P450 exhibits cytochrome P450 santalene oxidase activity or the ability to catalyze the formation of santalols or bergamotol from santalenes and bergamotene if the amount of santalols and bergamotol produced from the reaction is at least or at least about 0.5%, 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more of the total amount of terpenoids produced in the reaction.
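
The activity criterion above can be expressed as a simple calculation. The sketch below is a minimal illustration, assuming that integrated GC-MS peak areas are used as a proxy for the amount of each terpenoid; that proxy, the function names and the peak areas shown are assumptions for illustration and are not data from the Examples.

```python
def target_percent_of_terpenoids(peak_areas, target_names):
    """Percentage of total terpenoid product accounted for by the target products.

    peak_areas: mapping of terpenoid name to integrated peak area (assumed proxy
    for amount); target_names: names counted as santalols or bergamotol.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("no terpenoid product detected")
    target = sum(area for name, area in peak_areas.items() if name in target_names)
    return 100.0 * target / total


def exhibits_activity(peak_areas, target_names, threshold_percent=0.5):
    """Apply the lowest stated threshold (0.5%) by default."""
    return target_percent_of_terpenoids(peak_areas, target_names) >= threshold_percent


# Hypothetical peak areas, for illustration only
areas = {
    "alpha-santalol": 30.0,
    "beta-santalol": 15.0,
    "alpha-trans-bergamotol": 5.0,
    "other terpenoid": 50.0,
}
targets = {"alpha-santalol", "beta-santalol", "alpha-trans-bergamotol"}
print(target_percent_of_terpenoids(areas, targets))
print(exhibits_activity(areas, targets))
```

The same calculation applies to bergamotene oxidase activity when only bergamotol is counted as the target product.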

As used herein, “cytochrome P450 bergamotene oxidase” or “cytochrome P450 bergamotene oxidase polypeptide” is a polypeptide capable of catalyzing the monooxygenation or hydroxylation of a bergamotene. For example, a cytochrome P450 bergamotene oxidase catalyzes the formation of Z-α-trans-bergamotol or E-α-trans-bergamotol from α-trans-bergamotene.

For purposes herein, cytochrome P450 bergamotene oxidases provided herein are enzymes with cytochrome P450 bergamotene oxidase activity and have greater than or greater than about 50%, 55%, 60%, 65%, 70%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity, when aligned with the cytochrome P450 bergamotene oxidase sequence set forth in SEQ ID NO:6, 8, 9 or 73. Reference to a cytochrome P450 bergamotene oxidase includes any cytochrome P450 bergamotene oxidase polypeptide including, but not limited to, a recombinantly produced polypeptide, synthetically produced polypeptide and a cytochrome P450 bergamotene oxidase polypeptide extracted or isolated from cells or plant matter, including, but not limited to, heartwood of a sandalwood tree. Exemplary cytochrome P450 bergamotene oxidase polypeptides include those isolated from Santalum album. Reference to a cytochrome P450 bergamotene oxidase includes cytochrome P450 bergamotene oxidase from any genus or species, and includes allelic or species variants, variants encoded by splice variants, and other variants thereof, including polypeptides that have at least or at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the cytochrome P450 bergamotene oxidase set forth in SEQ ID NO:6, 8, 9 or 73 when aligned therewith. Cytochrome P450 bergamotene oxidase also includes catalytically active fragments thereof that retain cytochrome P450 bergamotene oxidase activity.

As used herein, “cytochrome P450 bergamotene oxidase activity” or “bergamotene oxidase activity” refers to the ability to catalyze the formation of bergamotols from bergamotenes. That is, cytochrome P450 bergamotene oxidases catalyze the monooxygenation or hydroxylation of bergamotene. For example, cytochrome P450 bergamotene oxidases catalyze the formation of Z-α-trans-bergamotol from α-trans-bergamotene. Methods to assess bergamotol formation from a reaction of a bergamotene are well known in the art and described herein. The production of a bergamotol can be assessed by methods such as, for example, gas chromatography-mass spectrometry (GC-MS) (see Examples below). A cytochrome P450 exhibits cytochrome P450 bergamotene oxidase activity or the ability to catalyze the formation of bergamotol from bergamotene if the amount of bergamotol produced from the reaction is at least or at least about 0.5%, 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more of the total amount of terpenoids produced in the reaction.

As used herein, α-santalol is a sesquiterpenoid having the following structure or isomers or stereoisomers thereof:

As used herein, β-santalol is a sesquiterpenoid having the following structure or isomers or stereoisomers thereof:

As used herein, epi-β-santalol is a sesquiterpenoid having the following structure or isomers or stereoisomers thereof:

As used herein, Z-α-trans-bergamotol or Z-α-exo-bergamotol is a sesquiterpenoid having the following structure or isomers or stereoisomers thereof:

As used herein, E-α-trans-bergamotol or E-α-exo-bergamotol is a sesquiterpenoid having the following structure or isomers or stereoisomers thereof:

As used herein, α-santalene is a sesquiterpene having the following structure or isomers or stereoisomers thereof:

As used herein, β-santalene is a sesquiterpene having the following structure or isomers or stereoisomers thereof:

As used herein, epi-β-santalene is a sesquiterpene having the following structure or isomers or stereoisomers thereof:

As used herein, α-trans-bergamotene or α-exo-bergamotene is a sesquiterpene having the following structure or isomers or stereoisomers thereof:

As used herein, “cytochrome P450 reductase” or “CPR” is a polypeptide capable of catalyzing the transfer of two electrons from NADPH to an electron acceptor, such as a cytochrome P450. For purposes herein, cytochrome P450 reductases provided herein are enzymes with cytochrome P450 reductase activity and have greater than or greater than about 50%, 55%, 60%, 65%, 70%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity, when aligned with the cytochrome P450 reductase sequence set forth in SEQ ID NO:12 or 13. Reference to a cytochrome P450 reductase includes any cytochrome P450 reductase polypeptide including, but not limited to, a recombinantly produced polypeptide, synthetically produced polypeptide and a cytochrome P450 reductase polypeptide extracted or isolated from cells or plant matter, including, but not limited to, heartwood of a sandalwood tree. Exemplary cytochrome P450 reductase polypeptides include those isolated from Santalum album. Reference to a cytochrome P450 reductase includes a cytochrome P450 reductase from any genus or species, and includes allelic or species variants, variants encoded by splice variants, and other variants thereof, including polypeptides that have at least or at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% or more sequence identity to the cytochrome P450 reductase set forth in SEQ ID NO:12 or 13 when aligned therewith. Cytochrome P450 reductase also includes catalytically active fragments thereof that retain cytochrome P450 reductase activity.

As used herein, “cytochrome P450 reductase activity” refers to the ability to catalyze the transfer of two electrons from NADPH to an electron acceptor, such as a cytochrome P450. Methods to assess cytochrome P450 reductase activity are well known in the art and described herein. For example, cytochrome P450 reductase activity can be determined by reduction of an artificial electron acceptor, such as cytochrome c.

As used herein, “wild type” or “native” with reference to a cytochrome P450 or cytochrome P450 reductase refers to a cytochrome P450 polypeptide or cytochrome P450 reductase polypeptide encoded by a native or naturally occurring cytochrome P450 gene or cytochrome P450 reductase gene, including allelic variants, that are present in an organism, including a plant, in nature. Reference to wild type cytochrome P450 or cytochrome P450 reductase without reference to a species is intended to encompass any species of a wild type cytochrome P450 or cytochrome P450 reductase.

As used herein, species variants refer to variants in polypeptides among different species, including different sandalwood species, such as Santalum album, Santalum australocaledonicum, Santalum spicatum and Santalum murrayanum.

Generally, species variants share at least or at least about 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or more sequence identity. Corresponding residues between and among species variants can be determined by comparing, generally one-by-one to the same reference sequence, and aligning each sequence with the reference sequence to maximize the number of matching nucleotides or amino acid residues. The position of interest is then given the number assigned in the reference nucleic acid molecule or polypeptide. Alignment can be effected manually or by eye, particularly where sequence identity is greater than 80%. To determine sequence identity among a plurality of variants, alignments are effected one-by-one against the same reference polypeptide.
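
The assignment of reference numbering to corresponding residues can be sketched as follows. This is an illustration under stated assumptions: the variant has already been aligned to the reference (with "-" marking gaps), positions are 1-based, and the function name and toy sequences are hypothetical.

```python
def reference_numbering(aligned_reference, aligned_variant, gap="-"):
    """Map each variant residue position to the reference position it aligns with.

    Returns a dict {variant_position: reference_position}; the value is None
    where the variant residue aligns to a gap in the reference (an insertion).
    """
    if len(aligned_reference) != len(aligned_variant):
        raise ValueError("aligned sequences must have the same length")
    mapping = {}
    ref_pos = 0
    var_pos = 0
    for ref_res, var_res in zip(aligned_reference, aligned_variant):
        if ref_res != gap:
            ref_pos += 1  # advance the reference numbering
        if var_res != gap:
            var_pos += 1  # advance the variant's own numbering
            mapping[var_pos] = ref_pos if ref_res != gap else None
    return mapping


# Toy alignment, for illustration only
print(reference_numbering("MKTAL-V", "MK-ALGV"))
```

Aligning every variant to the same reference in this way gives all variants a common numbering, which is why the alignments are effected one-by-one against a single reference polypeptide.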

As used herein, an allelic variant or allelic variation refers to any of two or more alternative forms of a gene occupying the same chromosomal locus. Allelic variation arises naturally through mutation, and can result in phenotypic polymorphism within populations. Gene mutations can be silent (no change in the encoded polypeptide) or can encode polypeptides having altered amino acid sequence. The term “allelic variant” also is used herein to denote a protein encoded by an allelic variant of a gene. Typically the reference form of the gene encodes a wild type form and/or predominant form of a polypeptide from a population or single reference member of a species. Typically, allelic variants, which include variants between and among species, have at least 80%, 90% or greater amino acid identity with a wild type and/or predominant form from the same species; the degree of identity depends upon the gene and whether comparison is interspecies or intraspecies. Generally, intraspecies allelic variants have at least about 80%, 85%, 90% or 95% identity or greater with a wild type and/or predominant form, including 96%, 97%, 98%, 99% or greater identity with a wild type and/or predominant form of a polypeptide. Reference to an allelic variant herein generally refers to variations in proteins among members of the same species.

As used herein, a splice variant refers to a variant produced by differential processing of a primary transcript of genomic DNA that results in more than one type of mRNA.

As used herein, a “modified cytochrome P450” or “modified cytochrome P450 polypeptide” or “modified CYP” refers to a cytochrome P450 polypeptide that has one or more amino acid differences compared to an unmodified or wild type cytochrome P450 polypeptide. The one or more amino acid differences can be amino acid mutations such as one or more amino acid replacements (substitutions), insertions or deletions, or can be insertions or deletions or replacements of entire domains or portions thereof, and any combination thereof. Modification can be effected by any mutational protocol, including gene shuffling methods. Typically, a modified cytochrome P450 polypeptide has one or more modifications in primary sequence compared to an unmodified cytochrome P450 polypeptide. For example, a modified cytochrome P450 polypeptide provided herein can have at least 1, 5, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135 or more amino acid differences compared to an unmodified cytochrome P450 polypeptide. Typically, the modified cytochrome P450 polypeptide will have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 amino acid replacements, but can include more, particularly when domains or portions thereof are swapped. Any modification is contemplated as long as the resulting polypeptide has at least one cytochrome P450 activity associated with the wild type cytochrome P450, such as, for example, catalytic activity, monooxygenase activity, and/or the ability to catalyze the formation of a terpenoid from a terpene. Generally, the resulting cytochrome P450 polypeptide will have at least 50% sequence identity with the wild type cytochrome P450 polypeptide provided herein.

As used herein, a “modified cytochrome P450 santalene oxidase” or “modified cytochrome P450 santalene oxidase polypeptide” refers to a cytochrome P450 santalene oxidase polypeptide that has one or more amino acid differences compared to an unmodified or wild type cytochrome P450 santalene oxidase polypeptide. The one or more amino acid differences can be amino acid mutations such as one or more amino acid replacements (substitutions), insertions or deletions, or can be insertions or deletions or replacements of entire domains or portions thereof, and any combination thereof. Modification can be effected by any mutational protocol, including gene shuffling methods. Typically, a modified cytochrome P450 santalene oxidase polypeptide has one or more modifications in primary sequence compared to an unmodified cytochrome P450 santalene oxidase polypeptide. For example, a modified cytochrome P450 santalene oxidase polypeptide provided herein can have at least 1, 5, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135 or more amino acid differences compared to an unmodified cytochrome P450 santalene oxidase polypeptide. Typically, the modified cytochrome P450 santalene oxidase polypeptide will have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 amino acid replacements, but can include more, particularly when domains or portions thereof are swapped. Any modification is contemplated as long as the resulting polypeptide has at least one cytochrome P450 santalene oxidase activity associated with the wild type cytochrome P450 santalene oxidase, such as, for example, catalytic activity, the ability to catalyze the formation of santalols or bergamotols from santalenes or bergamotenes. Generally, the resulting cytochrome P450 santalene oxidase polypeptide will have at least 50% sequence identity with the wild type cytochrome P450 santalene oxidase polypeptide provided herein.

As used herein, a “modified cytochrome P450 bergamotene oxidase” or “modified cytochrome P450 bergamotene oxidase polypeptide” refers to a cytochrome P450 bergamotene oxidase polypeptide that has one or more amino acid differences compared to an unmodified or wild type cytochrome P450 bergamotene oxidase polypeptide. The one or more amino acid differences can be amino acid mutations such as one or more amino acid replacements (substitutions), insertions or deletions, or can be insertions or deletions or replacements of entire domains or portions thereof, and any combination thereof. Modification can be effected by any mutational protocol, including gene shuffling methods. Typically, a modified cytochrome P450 bergamotene oxidase polypeptide has one or more modifications in primary sequence compared to an unmodified cytochrome P450 bergamotene oxidase polypeptide. For example, a modified cytochrome P450 bergamotene oxidase polypeptide provided herein can have at least 1, 5, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135 or more amino acid differences compared to an unmodified cytochrome P450 bergamotene oxidase polypeptide. Typically, the modified cytochrome P450 bergamotene oxidase polypeptide will have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 amino acid replacements, but can include more, particularly when domains or portions thereof are swapped. Any modification is contemplated as long as the resulting polypeptide has at least one cytochrome P450 bergamotene oxidase activity associated with the wild type cytochrome P450 bergamotene oxidase polypeptide, such as, for example, catalytic activity, the ability to catalyze the formation of bergamotols from bergamotenes. Generally, the resulting cytochrome P450 bergamotene oxidase polypeptide will have at least 50% sequence identity with the wild type cytochrome P450 bergamotene oxidase polypeptide provided herein.

As used herein, a “modified cytochrome P450 reductase” or “modified CPR” refers to a cytochrome P450 reductase polypeptide that has one or more amino acid differences compared to an unmodified or wild type cytochrome P450 reductase polypeptide. The one or more amino acid differences can be amino acid mutations such as one or more amino acid replacements (substitutions), insertions or deletions, or can be insertions or deletions or replacements of entire domains or portions thereof, and any combination thereof. Modification can be effected by any mutational protocol, including gene shuffling methods. Typically, a modified cytochrome P450 reductase polypeptide has one or more modifications in primary sequence compared to an unmodified cytochrome P450 reductase polypeptide. For example, a modified cytochrome P450 reductase polypeptide provided herein can have at least 1, 5, 10, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 90, 95, 100, 105, 110, 115, 120, 125, 130, 135 or more amino acid differences compared to an unmodified cytochrome P450 reductase polypeptide. Typically, the modified cytochrome P450 reductase polypeptide will have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 amino acid replacements, but can include more, particularly when domains or portions thereof are swapped. Any modification is contemplated as long as the resulting polypeptide has at least one cytochrome P450 reductase activity associated with the wild type cytochrome P450 reductase, such as, for example, catalytic activity, the ability to transfer two electrons to an electron acceptor, such as a cytochrome P450. Generally, the resulting cytochrome P450 reductase polypeptide will have at least 50% sequence identity with the wild type cytochrome P450 reductase polypeptide provided herein.

As used herein, corresponding residues refers to residues that occur at aligned loci. Related or variant polypeptides are aligned by any method known to those of skill in the art. Such methods typically maximize matches, and include methods such as manual alignments and those produced by the numerous alignment programs available (for example, BLASTP) and others known to those of skill in the art. By aligning the sequences of polypeptides, one skilled in the art can identify corresponding residues, using conserved and identical amino acid residues as guides. Corresponding positions also can be based on structural alignments, for example by using computer simulated alignments of protein structure. For example, corresponding residues between a cytochrome P450 santalene oxidase synthase and cytochrome P450 bergamotene oxidase synthase are shown in FIGS. 2A-2B and 21A-21C and corresponding residues between Arabidopsis thaliana cytochrome P450 reductases and Santalum album cytochrome P450 reductases are shown in FIGS. 3A-3C.

As used herein, domain or region (typically a sequence of at least three or more, generally 5 or 7 or more amino acids) refers to a portion of a molecule, such as a protein or the encoding nucleic acids, that is structurally and/or functionally distinct from other portions of the molecule and is identifiable. A protein can have one, or more than one, distinct domains. For example, a domain can be identified, defined or distinguished by homology of the sequence therein to related family members, such as other terpene synthases. A domain can be a linear sequence of amino acids or a non-linear sequence of amino acids. Many polypeptides contain a plurality of domains. Such domains are known to, and can be identified by, those of skill in the art. For exemplification herein, definitions are provided, but it is understood that it is well within the skill in the art to recognize particular domains by name. If needed, appropriate software can be employed to identify domains. For example, as discussed above, corresponding domains in different cytochrome P450s or cytochrome P450 reductases can be identified by sequence alignments, such as using tools and algorithms well known in the art (for example, BLASTP).

As used herein, a functional domain refers to those portions of a polypeptide that are recognized by virtue of a functional activity, such as catalytic activity. A functional domain can be distinguished by its function, such as by catalytic activity, or an ability to interact with a biomolecule, such as substrate binding or metal binding. In some examples, a domain independently can exhibit a biological function or property such that the domain independently or fused to another molecule can perform an activity, such as, for example, catalytic activity or substrate binding.

As used herein, a structural domain refers to those portions of a polypeptide chain that can form an independently folded structure within a protein made up of one or more structural motifs.

As used herein, “heterologous” with respect to an amino acid or nucleic acid sequence refers to portions of a sequence that are not present in a native polypeptide or encoded by a polynucleotide. For example, a portion of amino acids of a polypeptide, such as a domain or region or portion thereof, for a cytochrome P450 santalene oxidase synthase is heterologous thereto if such amino acids are not present in a native or wild type cytochrome P450 santalene oxidase synthase (e.g. as set forth in SEQ ID NO:7), or encoded by the polynucleotide encoding therefor. Polypeptides containing such heterologous amino acids or polynucleotides encoding therefor are referred to as “chimeric polypeptides” or “chimeric polynucleotides,” respectively.

As used herein, the phrase “a property of the modified cytochrome P450 is improved compared to the first cytochrome P450” refers to a desirable change in a property of a modified cytochrome P450 compared to a cytochrome P450 that does not contain the modification(s). Typically, the property or properties are improved such that the amount of a desired terpenoid produced from the reaction of a terpene substrate with the modified cytochrome P450 synthase is increased compared to the amount of the desired terpenoid produced from the reaction of a substrate with a cytochrome P450 synthase that is not so modified. Exemplary properties that can be improved in a modified cytochrome P450 synthase include, for example, terpenoid production, catalytic activity, product distribution, substrate specificity, regioselectivity and stereoselectivity. One or more of the properties can be assessed using methods well known in the art to determine whether the property has been improved (i.e. has been altered to be more desirable for the production of a desired terpenoid or terpenoids).

As used herein, terpenoid production (also referred to as terpenoid yield) refers to the amount (in weight or weight/volume) of terpenoid produced from the reaction of a terpene with a cytochrome P450. Reference to total terpenoid production refers to the total amount of all terpenoids produced from the reaction, while reference to particular terpenoid production refers to the amount of a particular terpenoid (e.g. β-santalol or α-santalol) produced from the reaction.

As used herein, an improved terpenoid production refers to an increase in the total amount of terpenoid (i.e. improved total terpenoid production) or an increase in the particular amount of terpenoid resulting from the reaction of a terpene with a modified cytochrome P450 compared to the amount produced from the reaction of the same terpene with a cytochrome P450 that is not so modified. The amount of terpenoid (total or particular) produced from the reaction of a terpene with a cytochrome P450 can be increased by at least or at least about 1%, 3%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100% or more compared to the amount of terpenoid produced from the reaction of the same terpene under the same conditions with a cytochrome P450 that is not so modified.

As used herein, substrate specificity refers to the preference of a cytochrome P450 for one target substrate over another, such as one terpene (e.g. β-santalene, α-santalene, epi-β-santalene or α-trans-bergamotene) over another. Substrate specificity can be assessed using methods well known in the art, such as those that calculate k_(cat)/K_(m). For example, the substrate specificity can be assessed by comparing the relative k_(cat)/K_(m), which is a measure of catalytic efficiency, of the enzyme against various substrates (e.g. β-santalene, α-santalene, epi-β-santalene or α-trans-bergamotene).
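By way of illustration only, the comparison of relative k_(cat)/K_(m) values described above can be sketched in Python as follows. The function names and the kinetic constants shown are hypothetical placeholders chosen for the sketch; they are not measured values and are not taken from the Examples.

```python
from typing import Dict, Tuple


def catalytic_efficiency(k_cat: float, k_m: float) -> float:
    """Return k_cat/K_m (s^-1 M^-1 when k_cat is in s^-1 and K_m is in M)."""
    if k_m <= 0:
        raise ValueError("K_m must be positive")
    return k_cat / k_m


def relative_specificity(
    params: Dict[str, Tuple[float, float]], reference: str
) -> Dict[str, float]:
    """Express each substrate's k_cat/K_m relative to a reference substrate."""
    ref = catalytic_efficiency(*params[reference])
    return {
        name: catalytic_efficiency(k_cat, k_m) / ref
        for name, (k_cat, k_m) in params.items()
    }


# Hypothetical constants as (k_cat in s^-1, K_m in M); illustrative only.
example_constants = {
    "beta-santalene": (2.0, 1.0e-5),
    "alpha-santalene": (1.0, 2.0e-5),
}
ratios = relative_specificity(example_constants, "beta-santalene")
```

A ratio below 1 for a given substrate indicates, under these assumed constants, a lower catalytic efficiency for that substrate than for the reference substrate.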

As used herein, altered substrate specificity refers to a change in substrate specificity of a modified cytochrome P450 polypeptide (such as a modified cytochrome P450 santalene oxidase polypeptide or cytochrome P450 bergamotene oxidase polypeptide) compared to a cytochrome P450 that is not so modified (such as, for example, a wild type cytochrome P450 santalene oxidase or cytochrome P450 bergamotene oxidase). The specificity (e.g. k_(cat)/K_(m)) of a modified cytochrome P450 polypeptide for a substrate, such as β-santalene, α-santalene, epi-β-santalene or α-trans-bergamotene, can be altered by at least or at least about 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100% or more compared to the specificity of a starting cytochrome P450 for the same substrate.

As used herein, “improved substrate specificity” refers to a change or alteration in the substrate specificity to a more desired specificity. For example, an improved substrate specificity can include an increase in substrate specificity of a modified cytochrome P450 polypeptide for a desired substrate, such as β-santalene, α-santalene, epi-β-santalene or α-trans-bergamotene. The specificity (e.g. k_(cat)/K_(m)) of a modified cytochrome P450 polypeptide for a substrate, such as β-santalene, α-santalene, epi-β-santalene or α-trans-bergamotene, can be increased by at least or at least about 1%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100% or more compared to the specificity of a cytochrome P450 that is not so modified.

As used herein, “product distribution” refers to the relative amounts of different terpenoids produced from the reaction between a terpene, such as β-santalene, and a cytochrome P450, including the cytochrome P450 polypeptides provided herein. The amount of a produced terpenoid can be depicted as a percentage of the total products produced by the cytochrome P450. For example, the product distribution resulting from reaction of β-santalene with a cytochrome P450 santalene oxidase can be 90% (weight/volume) β-santalol and 10% (weight/volume) other compounds. Methods for assessing the type and amount of a terpenoid in a solution are well known in the art and described herein, and include, for example, gas chromatography-mass spectrometry (GC-MS) (see Examples below).

As used herein, an altered product distribution refers to a change in the relative amount of individual terpenoids produced from the reaction between a terpene, such as β-santalene, and a cytochrome P450, such as cytochrome P450 santalene oxidase. Typically, the change is assessed by determining the relative amount of individual terpenoids produced from the terpene using a first cytochrome P450 (e.g. wild type cytochrome P450) and then comparing it to the relative amount of individual terpenoids produced using a second cytochrome P450 (e.g. a modified cytochrome P450). An altered product distribution is considered to occur if the relative amount of any one or more terpenoids is increased or decreased by at least or by at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80% or more.
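For illustration, the calculation of a product distribution and the comparison of two distributions can be sketched as follows. The sketch assumes that the change in relative amount is measured in percentage points of the total product, which is one possible reading of the thresholds above; the function names and the amounts are hypothetical.

```python
from typing import Dict


def product_distribution(amounts: Dict[str, float]) -> Dict[str, float]:
    """Convert product amounts (any consistent unit) to percent of total."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total product amount must be positive")
    return {name: 100.0 * amount / total for name, amount in amounts.items()}


def is_altered(
    first: Dict[str, float], second: Dict[str, float], threshold: float = 0.5
) -> bool:
    """True if any product's share differs by at least `threshold` points.

    Assumption: the change is expressed in percentage points of the total.
    """
    names = set(first) | set(second)
    return any(
        abs(second.get(name, 0.0) - first.get(name, 0.0)) >= threshold
        for name in names
    )


# Hypothetical amounts; the first mirrors the 90%/10% example in the text.
wild_type = product_distribution({"beta-santalol": 9.0, "other": 1.0})
modified = product_distribution({"beta-santalol": 9.5, "other": 0.5})
altered = is_altered(wild_type, modified, threshold=0.5)
```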

As used herein, an improved product distribution refers to a change in the product distribution to one that is more desirable, i.e. contains more desirable relative amounts of terpenoids. For example, an improved product distribution can contain an increased amount of a desired terpenoid and/or a decreased amount of a terpenoid that is not so desired. The amount of desired terpenoid in an improved product distribution can be increased by at least or by at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80% or more. The amount of a terpenoid that is not desired in an improved product distribution can be decreased by at least or by at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80% or more.

As used herein, nucleic acids or nucleic acid molecules include DNA, RNA and analogs thereof, including peptide nucleic acids (PNA) and mixtures thereof. Nucleic acids can be single or double-stranded. When referring to probes or primers, which are optionally labeled, such as with a detectable label, such as a fluorescent or radiolabel, single-stranded molecules are contemplated. Such molecules are typically of a length such that their target is statistically unique or of low copy number (typically less than 5, generally less than 3) for probing or priming a library. Generally a probe or primer contains at least 14, 16 or 30 contiguous nucleotides of sequence complementary to or identical to a gene of interest. Probes and primers can be 10, 20, 30, 50, 100 or more nucleotides long.

As used herein, the term polynucleotide means a single- or double-stranded polymer of deoxyribonucleotide or ribonucleotide bases read from the 5′ to the 3′ end. Polynucleotides include RNA and DNA, and can be isolated from natural sources, synthesized in vitro, or prepared from a combination of natural and synthetic molecules. The length of a polynucleotide molecule is given herein in terms of nucleotides (abbreviated “nt”) or base pairs (abbreviated “bp”). The term nucleotides is used for single- and double-stranded molecules where the context permits. When the term is applied to double-stranded molecules it is used to denote overall length and will be understood to be equivalent to the term base pairs. It will be recognized by those skilled in the art that the two strands of a double-stranded polynucleotide can differ slightly in length and that the ends thereof can be staggered; thus not all nucleotides within a double-stranded polynucleotide molecule need be paired. Such unpaired ends will, in general, not exceed 20 nucleotides in length.

As used herein, heterologous nucleic acid is nucleic acid that is not normally produced in vivo by the cell in which it is expressed or that is produced by the cell but is at a different locus or expressed differently or that mediates or encodes mediators that alter expression of endogenous nucleic acid, such as DNA, by affecting transcription, translation, or other regulatable biochemical processes. Heterologous nucleic acid is generally not endogenous to the cell into which it is introduced, but has been obtained from another cell or prepared synthetically. Heterologous nucleic acid can be endogenous, but is nucleic acid that is expressed from a different locus or altered in its expression. Generally, although not necessarily, such nucleic acid encodes RNA and proteins that are not normally produced by the cell or in the same way in the cell in which it is expressed. Heterologous nucleic acid, such as DNA, also can be referred to as foreign nucleic acid, such as DNA. Thus, heterologous nucleic acid or foreign nucleic acid includes a nucleic acid molecule not present in the exact orientation or position as the counterpart nucleic acid molecule, such as DNA, is found in a genome. It also can refer to a nucleic acid molecule from another organism or species (i.e., exogenous).

Any nucleic acid, such as DNA, that one of skill in the art would recognize or consider as heterologous or foreign to the cell in which the nucleic acid is expressed is herein encompassed by heterologous nucleic acid; heterologous nucleic acid includes exogenously added nucleic acid that also is expressed endogenously. Examples of heterologous nucleic acid include, but are not limited to, nucleic acid that encodes traceable marker proteins, such as a protein that confers drug resistance, nucleic acid that encodes therapeutically effective substances, such as anti-cancer agents, enzymes and hormones, and nucleic acid, such as DNA, that encodes other types of proteins, such as antibodies. Antibodies that are encoded by heterologous nucleic acid can be secreted or expressed on the surface of the cell in which the heterologous nucleic acid has been introduced.

As used herein, a peptide refers to a polypeptide that is from 2 to 40 amino acids in length.

As used herein, the amino acids that occur in the various sequences of amino acids provided herein are identified according to their known, three-letter or one-letter abbreviations (Table 1). The nucleotides which occur in the various nucleic acid fragments are designated with the standard single-letter designations used routinely in the art.

As used herein, an “amino acid” is an organic compound containing an amino group and a carboxylic acid group. A polypeptide contains two or more amino acids. For purposes herein, amino acids include the twenty naturally-occurring amino acids, non-natural amino acids and amino acid analogs (i.e., amino acids in which the α-carbon has a side chain).

In keeping with standard polypeptide nomenclature described in J. Biol. Chem., 243: 3557-3559 (1968), and adopted in 37 C.F.R. §§ 1.821-1.822, abbreviations for the amino acid residues are shown in Table 1:

TABLE 1
Table of Correspondence

SYMBOL
1-Letter   3-Letter   AMINO ACID
Y          Tyr        Tyrosine
G          Gly        Glycine
F          Phe        Phenylalanine
M          Met        Methionine
A          Ala        Alanine
S          Ser        Serine
I          Ile        Isoleucine
L          Leu        Leucine
T          Thr        Threonine
V          Val        Valine
P          Pro        Proline
K          Lys        Lysine
H          His        Histidine
Q          Gln        Glutamine
E          Glu        Glutamic acid
Z          Glx        Glu and/or Gln
W          Trp        Tryptophan
R          Arg        Arginine
D          Asp        Aspartic acid
N          Asn        Asparagine
B          Asx        Asn and/or Asp
C          Cys        Cysteine
X          Xaa        Unknown or other
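As a non-limiting illustration, the correspondence in Table 1 can be represented as a lookup that converts a one-letter sequence to three-letter symbols. The function name and the choice to map any unlisted character to Xaa are assumptions made for this sketch.

```python
# One-letter to three-letter symbols, transcribed from Table 1.
ONE_TO_THREE = {
    "Y": "Tyr", "G": "Gly", "F": "Phe", "M": "Met", "A": "Ala",
    "S": "Ser", "I": "Ile", "L": "Leu", "T": "Thr", "V": "Val",
    "P": "Pro", "K": "Lys", "H": "His", "Q": "Gln", "E": "Glu",
    "Z": "Glx", "W": "Trp", "R": "Arg", "D": "Asp", "N": "Asn",
    "B": "Asx", "C": "Cys", "X": "Xaa",
}


def to_three_letter(sequence: str) -> str:
    """Render a one-letter sequence, amino- to carboxyl-terminus, as
    hyphen-separated three-letter symbols.

    Assumption: any character not listed in Table 1 is reported as Xaa.
    """
    return "-".join(ONE_TO_THREE.get(residue, "Xaa") for residue in sequence.upper())


example = to_three_letter("MAS")  # "Met-Ala-Ser"
```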

All amino acid residue sequences represented herein by formulae have a left to right orientation in the conventional direction of amino-terminus to carboxyl-terminus. In addition, the phrase “amino acid residue” is broadly defined to include the amino acids listed in the Table of Correspondence (Table 1) and modified and unusual amino acids, such as those referred to in 37 C.F.R. §§ 1.821-1.822, and incorporated herein by reference. Furthermore, a dash at the beginning or end of an amino acid residue sequence indicates a peptide bond to a further sequence of one or more amino acid residues, to an amino-terminal group such as NH₂ or to a carboxyl-terminal group such as COOH.

As used herein, “naturally occurring amino acids” refer to the 20 L-amino acids that occur in polypeptides.

As used herein, “non-natural amino acid” refers to an organic compound containing an amino group and a carboxylic acid group that is not one of the naturally-occurring amino acids listed in Table 1. Non-naturally occurring amino acids thus include, for example, amino acids or analogs of amino acids other than the 20 naturally-occurring amino acids and include, but are not limited to, the D-stereoisomers of amino acids. Exemplary non-natural amino acids are known to those of skill in the art and can be included in a modified cytochrome P450 polypeptide or cytochrome P450 reductase polypeptide provided herein.

As used herein, modification is in reference to modification of the primary sequence of amino acids of a polypeptide or a sequence of nucleotides in a nucleic acid molecule and includes deletions, insertions, and replacements and rearrangements of amino acids and nucleotides. For purposes herein, amino acid replacements (or substitutions), deletions and/or insertions, can be made in any of the cytochrome P450s or cytochrome P450 reductases provided herein. Modifications can be made by making conservative amino acid replacements and also non-conservative amino acid substitutions as well as by insertions, domain swaps and other such changes in primary sequence. For example, amino acid replacements that desirably or advantageously alter properties of the cytochrome P450 or cytochrome P450 reductase can be made. For example, amino acid replacements can be made to the cytochrome P450 santalene oxidase such that the resulting modified cytochrome P450 santalene oxidase can produce more β-santalol from a mixture of santalenes and bergamotenes compared to an unmodified cytochrome P450 santalene oxidase. For example, amino acid replacements can be made to the cytochrome P450 bergamotene oxidase such that the resulting cytochrome P450 bergamotene oxidase can produce more bergamotol from a mixture of santalenes and bergamotenes compared to an unmodified cytochrome P450 bergamotene oxidase. Modifications also can include post-translational modifications or other changes to the molecule that can occur due to conjugation or linkage, directly or indirectly, to another moiety, but when such modifications are contemplated they are referred to as post-translational modifications or conjugates or other such term as appropriate. Methods of modifying a polypeptide are routine to those of skill in the art, and can be performed by standard methods, such as site directed mutations, amplification methods, and gene shuffling methods.

As used herein, amino acid replacements or substitutions contemplated include, but are not limited to, conservative substitutions, including, but not limited to, those set forth in Table 2. Suitable conservative substitutions of amino acids are known to those of skill in the art and can be made generally without altering the conformation or activity of the polypeptide. Those of skill in this art recognize that, in general, single amino acid substitutions in non-essential regions of a polypeptide do not substantially alter biological activity (see, e.g., Watson et al. Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p. 224). Conservative amino acid substitutions are made, for example, in accordance with those set forth in Table 2 as follows:

TABLE 2

Original residue   Conservative substitution
Ala (A)            Gly; Ser
Arg (R)            Lys
Asn (N)            Gln; His
Cys (C)            Ser
Gln (Q)            Asn
Glu (E)            Asp
Gly (G)            Ala; Pro
His (H)            Asn; Gln
Ile (I)            Leu; Val
Leu (L)            Ile; Val
Lys (K)            Arg; Gln; Glu
Met (M)            Leu; Tyr; Ile
Phe (F)            Met; Leu; Tyr
Ser (S)            Thr
Thr (T)            Ser
Trp (W)            Tyr
Tyr (Y)            Trp; Phe
Val (V)            Ile; Leu; Met

Other conservative substitutions also are contemplated and can be determined empirically or in accord with known conservative substitutions.
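By way of example only, the substitutions listed in Table 2 can be encoded as a lookup for checking whether a given replacement is among those listed. The function name is hypothetical, and the lookup covers only the entries of Table 2; as noted above, other conservative substitutions are contemplated and are not captured by this sketch.

```python
# Conservative substitutions transcribed from Table 2 (one-letter codes).
CONSERVATIVE = {
    "A": {"G", "S"}, "R": {"K"}, "N": {"Q", "H"}, "C": {"S"},
    "Q": {"N"}, "E": {"D"}, "G": {"A", "P"}, "H": {"N", "Q"},
    "I": {"L", "V"}, "L": {"I", "V"}, "K": {"R", "Q", "E"},
    "M": {"L", "Y", "I"}, "F": {"M", "L", "Y"}, "S": {"T"},
    "T": {"S"}, "W": {"Y"}, "Y": {"W", "F"}, "V": {"I", "L", "M"},
}


def is_listed_conservative(original: str, replacement: str) -> bool:
    """True if replacing `original` with `replacement` appears in Table 2."""
    return replacement.upper() in CONSERVATIVE.get(original.upper(), set())


ala_to_gly = is_listed_conservative("A", "G")  # listed in Table 2
gly_to_trp = is_listed_conservative("G", "W")  # not listed in Table 2
```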

As used herein, a DNA construct is a single or double stranded, linear or circular DNA molecule that contains segments of DNA combined and juxtaposed in a manner not found in nature. DNA constructs exist as a result of human manipulation, and include clones and other copies of manipulated molecules.

As used herein, a DNA segment is a portion of a larger DNA molecule having specified attributes. For example, a DNA segment encoding a specified polypeptide is a portion of a longer DNA molecule, such as a plasmid or plasmid fragment, which, when read from the 5′ to 3′ direction, encodes the sequence of amino acids of the specified polypeptide.

As used herein, “primary sequence” refers to the sequence of amino acid residues in a polypeptide.

As used herein, “similarity” between two proteins or nucleic acids refers to the relatedness between the sequence of amino acids of the proteins or the nucleotide sequences of the nucleic acids. Similarity can be based on the degree of identity and/or homology of sequences of residues and the residues contained therein. Methods for assessing the degree of similarity between proteins or nucleic acids are known to those of skill in the art. For example, in one method of assessing sequence similarity, two amino acid or nucleotide sequences are aligned in a manner that yields a maximal level of identity between the sequences. “Identity” refers to the extent to which the amino acid or nucleotide sequences are invariant. Alignment of amino acid sequences, and to some extent nucleotide sequences, also can take into account conservative differences and/or frequent substitutions in amino acids (or nucleotides). Conservative differences are those that preserve the physico-chemical properties of the residues involved. Alignments can be global (alignment of the compared sequences over the entire length of the sequences and including all residues) or local (the alignment of a portion of the sequences that includes only the most similar region or regions).

As used herein, “at a position corresponding to” or recitation that nucleotides or amino acid positions “correspond to” nucleotides or amino acid positions in a disclosed sequence, such as set forth in the Sequence Listing, refers to nucleotides or amino acid positions identified upon alignment with the disclosed sequence to maximize identity using a standard alignment algorithm, such as the GAP algorithm. For purposes herein, alignment of a cytochrome P450 santalene oxidase sequence is to the amino acid sequence set forth in SEQ ID NO:7. For purposes herein, alignment of a cytochrome P450 bergamotene oxidase sequence is to the amino acid sequence set forth in any of SEQ ID NOS:6, 8 or 9, and in particular SEQ ID NO:6. By aligning the sequences, one skilled in the art can identify corresponding residues, for example, using conserved and identical amino acid residues as guides. In general, to identify corresponding positions, the sequences of amino acids are aligned so that the highest order match is obtained (see, e.g.: Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991; Carrillo et al. (1988) SIAM J Applied Math 48:1073). FIGS. 2A-2B, 3A-3C, 5A-5B and 21A-21C show exemplary alignments and identification of exemplary corresponding residues for replacement.
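For illustration, once two sequences have been aligned, corresponding positions can be read off the alignment column by column. The sketch below assumes that the alignment is supplied as two equal-length strings in which a hyphen denotes a gap; the function name and the short example sequences are hypothetical and are not taken from the Sequence Listing.

```python
from typing import Dict, Optional


def corresponding_positions(
    aligned_reference: str, aligned_query: str
) -> Dict[int, Optional[int]]:
    """Map 1-based reference positions to 1-based query positions.

    A reference residue aligned to a gap in the query maps to None.
    """
    if len(aligned_reference) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    mapping: Dict[int, Optional[int]] = {}
    ref_pos = 0
    query_pos = 0
    for ref_char, query_char in zip(aligned_reference, aligned_query):
        if ref_char != "-":
            ref_pos += 1
        if query_char != "-":
            query_pos += 1
        if ref_char != "-":
            mapping[ref_pos] = query_pos if query_char != "-" else None
    return mapping


# Hypothetical aligned fragments; "-" marks a gap.
positions = corresponding_positions("MKT-AL", "M-TGAL")
```

In this hypothetical example, reference position 3 corresponds to query position 2, and reference position 2 has no corresponding query residue.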

As used herein, “sequence identity” refers to the number of identical or similar amino acids or nucleotide bases in a comparison between a test and a reference polypeptide or polynucleotide. Sequence identity can be determined by sequence alignment of nucleic acid or protein sequences to identify regions of similarity or identity. For purposes herein, sequence identity is generally determined by alignment to identify identical residues. The alignment can be local or global. Matches, mismatches and gaps can be identified between compared sequences. Gaps are null amino acids or nucleotides inserted between the residues of aligned sequences so that identical or similar characters are aligned. Generally, there can be internal and terminal gaps. Sequence identity can be determined by taking into account gaps as the number of identical residues/length of the shortest sequence×100. When using gap penalties, sequence identity can be determined with no penalty for end gaps (e.g. terminal gaps are not penalized). Alternatively, sequence identity can be determined without taking into account gaps as the number of identical positions/length of the total aligned sequence×100.
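The two calculations recited above can be sketched as follows. The sketch assumes that the alignment is provided as two equal-length strings with hyphens as gaps, and that “length of the total aligned sequence” means the number of alignment columns including gap columns; the function names and example strings are hypothetical.

```python
def count_identical(aligned_a: str, aligned_b: str) -> int:
    """Count alignment columns in which both sequences have the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")


def identity_over_shortest(aligned_a: str, aligned_b: str) -> float:
    """Identical residues / length of the shortest (ungapped) sequence x 100."""
    shortest = min(len(aligned_a.replace("-", "")), len(aligned_b.replace("-", "")))
    if shortest == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * count_identical(aligned_a, aligned_b) / shortest


def identity_over_alignment(aligned_a: str, aligned_b: str) -> float:
    """Identical positions / total alignment length (gap columns included) x 100."""
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    return 100.0 * count_identical(aligned_a, aligned_b) / len(aligned_a)


# Hypothetical alignment: 4 identical columns, 5 residues each, 6 columns.
pct_shortest = identity_over_shortest("MKT-AL", "M-TGAL")
pct_alignment = identity_over_alignment("MKT-AL", "M-TGAL")
```

The two values differ whenever the alignment contains gap columns, which is why the denominator used should be stated when a percent identity is reported.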

As used herein, a “global alignment” is an alignment that aligns two sequences from beginning to end, aligning each letter in each sequence only once. An alignment is produced, regardless of whether or not there is similarity or identity between the sequences. For example, 50% sequence identity based on “global alignment” means that in an alignment of the full sequence of two compared sequences each of 100 nucleotides in length, 50% of the residues are the same. It is understood that global alignment also can be used in determining sequence identity even when the length of the aligned sequences is not the same. The differences in the terminal ends of the sequences will be taken into account in determining sequence identity, unless the “no penalty for end gaps” is selected. Generally, a global alignment is used on sequences that share significant similarity over most of their length. Exemplary algorithms for performing global alignment include the Needleman-Wunsch algorithm (Needleman et al. (1970) J. Mol. Biol. 48: 443). Exemplary programs for performing global alignment are publicly available and include the Global Sequence Alignment Tool available at the National Center for Biotechnology Information (NCBI) website (ncbi.nlm.nih.gov/), and the program available at deepc2.psi.iastate.edu/aat/align/align.html.
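A minimal sketch of a Needleman-Wunsch style global alignment is shown below for illustration. It uses a simple linear gap penalty and arbitrary match, mismatch and gap scores chosen for the sketch; these values are assumptions and do not reproduce the defaults of the NCBI tool or of any other program named above.

```python
from typing import Tuple


def needleman_wunsch(
    a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1
) -> Tuple[str, str, int]:
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("ACGT", "AGT")
```

Because every residue of both sequences appears in the output, an alignment is returned even when the sequences share no similarity, consistent with the definition above.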

As used herein, a “local alignment” is an alignment that aligns two sequences, but only aligns those portions of the sequences that share similarity or identity. Hence, a local alignment determines if sub-segments of one sequence are present in another sequence. If there is no similarity, no alignment will be returned. Local alignment algorithms include BLAST and the Smith-Waterman algorithm (Adv. Appl. Math. 2:482 (1981)). For example, 50% sequence identity based on “local alignment” means that in an alignment of the full sequence of two compared sequences of any length, a region of similarity or identity of 100 nucleotides in length has 50% of the residues that are the same in the region of similarity or identity.
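The behavior described above, in which only the similar sub-segments are aligned and nothing is returned when there is no similarity, can be sketched with a simplified Smith-Waterman implementation. The scoring values (match +2, mismatch -1, linear gap penalty -2) and the example sequences are assumptions made for illustration only.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Locally align a and b; returns (segment_a, segment_b, score).

    Returns two empty strings and a score of 0 when no residues match.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Scores are floored at 0 so an alignment can start anywhere.
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1

    return "".join(reversed(out_a)), "".join(reversed(out_b)), best


# Hypothetical sequences sharing only the internal segment GATTACA.
segment_a, segment_b, local_score = smith_waterman(
    "TTTTGATTACATTTT", "CCCGATTACACCC"
)
```

In the hypothetical example only the shared internal segment is reported; the dissimilar flanking residues are left out of the alignment.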

For purposes herein, sequence identity can be determined by standard alignment algorithm programs used with default gap penalties established by each supplier or manually. Default parameters for the GAP program can include: (1) a unary comparison matrix (containing a value of 1 for identities and 0 for non-identities) and the weighted comparison matrix of Gribskov et al. (1986) Nucl. Acids Res. 14:6745, as described by Schwartz and Dayhoff, eds., Atlas of Protein Sequence and Structure, National Biomedical Research Foundation, pp. 353-358 (1979); (2) a penalty of 3.0 for each gap and an additional 0.10 penalty for each symbol in each gap; and (3) no penalty for end gaps. Whether any two nucleic acid molecules have nucleotide sequences or any two polypeptides have amino acid sequences that are at least 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% “identical,” or other similar variations reciting a percent identity, can be determined using known computer algorithms based on local or global alignment (see e.g., wikipedia.org/wiki/Sequence_alignment_software, providing links to dozens of known and publicly available alignment databases and programs). Generally, for purposes herein sequence identity is determined using computer algorithms based on global alignment, such as the Needleman-Wunsch Global Sequence Alignment tool available from NCBI/BLAST (blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&Page_TYPE=BlastHome); LAlign (William Pearson implementing the Huang and Miller algorithm (Adv. Appl. Math. (1991) 12:337-357)); and the program from Xiaoqiu Huang available at deepc2.psi.iastate.edu/aat/align/align.html. Generally, when comparing nucleotide sequences herein, an alignment with penalty for end gaps is used. Local alignment also can be used when the sequences being compared are substantially the same length.

As used herein, the term “identity” represents a comparison between a test and a reference polypeptide or polynucleotide. In one non-limiting example, “at least 90% identical to” refers to percent identities from 90 to 100% relative to the reference polypeptides. Identity at a level of 90% or more is indicative of the fact that, assuming for exemplification purposes a test and reference polypeptide length of 100 amino acids are compared, no more than 10% (i.e., 10 out of 100) of amino acids in the test polypeptide differs from that of the reference polypeptides. Similar comparisons can be made between test and reference polynucleotides. Such differences can be represented as point mutations randomly distributed over the entire length of an amino acid sequence or they can be clustered in one or more locations of varying length up to the maximum allowable, e.g., 10/100 amino acid difference (approximately 90% identity). Differences also can be due to deletions or truncations of amino acid residues. Differences are defined as nucleic acid or amino acid substitutions, insertions or deletions. Depending on the length of the compared sequences, at the level of homologies or identities above about 85-90%, the result is reasonably independent of the program and gap parameters set; such high levels of identity can be assessed readily, often without relying on software.

As used herein, the meaning of the terms “substantially identical” or “similar” varies with the context as understood by those skilled in the relevant art, and those of skill can assess such.

As used herein, an aligned sequence refers to the use of homology (similarity and/or identity) to align corresponding positions in a sequence of nucleotides or amino acids. Typically, two or more sequences that are related by 50% or about 50% or more identity are aligned. An aligned set of sequences refers to 2 or more sequences that are aligned at corresponding positions and can include aligning sequences derived from RNAs, such as ESTs and other cDNAs, aligned with genomic DNA sequence.

As used herein, substantially pure means sufficiently homogeneous to appear free of readily detectable impurities as determined by standard methods of analysis, such as thin layer chromatography (TLC), gel electrophoresis and high performance liquid chromatography (HPLC), used by those of skill in the art to assess such purity, or sufficiently pure such that further purification would not detectably alter the physical and chemical properties, such as enzymatic and biological activities, of the substance. Methods for purification of the compounds to produce substantially chemically pure compounds are known to those of skill in the art. A substantially chemically pure compound can, however, be a mixture of stereoisomers or isomers. In such instances, further purification might increase the specific activity of the compound.

As used herein, an isolated or purified polypeptide or protein or biologically-active portion thereof is substantially free of cellular material or other contaminating proteins from the cell or tissue from which the protein is derived, or substantially free from chemical precursors or other chemicals when chemically synthesized. Preparations can be determined to be substantially free if they appear free of readily detectable impurities as determined by standard methods of analysis, such as thin layer chromatography (TLC), gel electrophoresis and high performance liquid chromatography (HPLC), used by those of skill in the art to assess such purity, or sufficiently pure such that further purification would not detectably alter the physical and chemical properties, such as proteolytic and biological activities, of the substance. Methods for purification of the compounds to produce substantially chemically pure compounds are known to those of skill in the art. A substantially chemically pure compound, however, can be a mixture of stereoisomers. In such instances, further purification might increase the specific activity of the compound.

As used herein, substantially free of cellular material includes preparations of cytochrome P450s, cytochrome P450 reductases, terpenes or terpenoid products in which the cytochrome P450, cytochrome P450 reductase, terpene or terpenoid product is separated from cellular components of the cells from which it is isolated or produced. In one embodiment, the term substantially free of cellular material includes preparations of cytochrome P450s, cytochrome P450 reductases, terpenes or terpenoid products having less than about or less than 30%, 20%, 10%, 5% or less (by dry weight) of non-cytochrome P450s, cytochrome P450 reductases, terpenes or terpenoid products, including cell culture medium. When the cytochrome P450 or cytochrome P450 reductase is recombinantly produced, it also is substantially free of culture medium, i.e., culture medium represents less than about or at 20%, 10% or 5% of the volume of the cytochrome P450 or cytochrome P450 reductase preparation.

As used herein, the term substantially free of chemical precursors or other chemicals includes preparations of cytochrome P450 or cytochrome P450 reductase proteins in which the protein is separated from chemical precursors or other chemicals that are involved in the synthesis of the protein. The term includes preparations of cytochrome P450 or cytochrome P450 reductase proteins having less than about or less than 30% (by dry weight), 20%, 10%, 5% or less of chemical precursors or non-synthase chemicals or components.

As used herein, synthetic, with reference to, for example, a synthetic nucleic acid molecule or a synthetic gene or a synthetic peptide refers to a nucleic acid molecule or polypeptide molecule that is produced by recombinant methods and/or by chemical synthesis methods.

As used herein, production by recombinant methods by using recombinant DNA methods refers to the use of the well known methods of molecular biology for expressing proteins encoded by cloned DNA.

As used herein, vector (or plasmid) refers to discrete DNA elements that are used to introduce heterologous nucleic acid into cells for either expression or replication thereof. The vectors typically remain episomal, but can be designed to effect integration of a gene or portion thereof into a chromosome of the genome. Also contemplated are vectors that are artificial chromosomes, such as bacterial artificial chromosomes, yeast artificial chromosomes and mammalian artificial chromosomes. Selection and use of such vehicles are well known to those of skill in the art.

As used herein, expression refers to the process by which nucleic acid is transcribed into mRNA and translated into peptides, polypeptides, or proteins. If the nucleic acid is derived from genomic DNA, expression can, if an appropriate eukaryotic host cell or organism is selected, include processing, such as splicing of the mRNA.

As used herein, an expression vector includes vectors capable of expressing DNA that is operatively linked with regulatory sequences, such as promoter regions, that are capable of effecting expression of such DNA fragments. Such additional segments can include promoter and terminator sequences, and optionally can include one or more origins of replication, one or more selectable markers, an enhancer, a polyadenylation signal, and the like. Expression vectors are generally derived from plasmid or viral DNA, or can contain elements of both. Thus, an expression vector refers to a recombinant DNA or RNA construct, such as a plasmid, a phage, recombinant virus or other vector that, upon introduction into an appropriate host cell, results in expression of the cloned DNA. Appropriate expression vectors are well known to those of skill in the art and include those that are replicable in eukaryotic cells and/or prokaryotic cells and those that remain episomal or those which integrate into the host cell genome.

As used herein, vector also includes “virus vectors” or “viral vectors.” Viral vectors are engineered viruses that are operatively linked to exogenous genes to transfer (as vehicles or shuttles) the exogenous genes into cells. Viral vectors include, but are not limited to, adenoviral vectors, retroviral vectors and vaccinia virus vectors.

As used herein, operably or operatively linked when referring to DNA segments means that the segments are arranged so that they function in concert for their intended purposes, e.g., transcription initiates downstream of the promoter and upstream of any transcribed sequences. The promoter is usually the domain to which the transcriptional machinery binds to initiate transcription, which then proceeds through the coding segment to the terminator.

As used herein, a “chimeric protein” or “fusion protein” refers to a polypeptide operatively-linked to a different polypeptide. For example, a polypeptide encoded by a nucleic acid sequence containing a coding sequence from one nucleic acid molecule and the coding sequence from another nucleic acid molecule in which the coding sequences are in the same reading frame such that when the fusion construct is transcribed and translated in a host cell, the protein is produced containing the two proteins. The two molecules can be adjacent in the construct or separated by a linker polypeptide that contains 1, 2, 3, or more, but typically fewer than 10, 9, 8, 7, or 6 amino acids. The protein product encoded by a fusion construct is referred to as a fusion polypeptide. A chimeric or fusion protein provided herein can include one or more santalene synthase polypeptides, or a portion thereof, and/or one or more cytochrome P450 polypeptides, or a portion thereof, and/or one or more cytochrome P450 reductase polypeptides and/or one or more other polypeptides, for any one or more of transcriptional/translational control signals, signal sequences, a tag for localization, a tag for purification, a protein for identification, part of a domain of an immunoglobulin G, and/or a targeting agent. A chimeric cytochrome P450 polypeptide or cytochrome P450 reductase polypeptide also includes those having their endogenous domains or regions of the polypeptide exchanged with another polypeptide. These chimeric or fusion proteins include those produced by recombinant means as fusion proteins, those produced by chemical means, such as by chemical coupling, through, for example, coupling to sulfhydryl groups, and those produced by any other method whereby at least one polypeptide (i.e., cytochrome P450 or cytochrome P450 reductase), or a portion thereof, is linked, directly or indirectly via linker(s) to another polypeptide.

As used herein, the term assessing or determining includes quantitative and qualitative determination in the sense of obtaining an absolute value for the activity of a product, and also of obtaining an index, ratio, percentage, visual or other value indicative of the level of the activity. Assessment can be direct or indirect.

As used herein, recitation that a polypeptide “consists essentially” of a recited sequence of amino acids means that only the recited portion, or a fragment thereof, of the full-length polypeptide is present. The polypeptide can optionally, and generally will, include additional amino acids from another source or can be inserted into another polypeptide.

As used herein, the singular forms “a”, “an” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to a polypeptide comprising “an amino acid replacement” includes polypeptides with one or a plurality of amino acid replacements.

As used herein, ranges and amounts can be expressed as “about” a particular value or range. About also includes the exact amount. Hence “about 5%” means “about 5%” and also “5%.”

As used herein, “optional” or “optionally” means that the subsequently described event or circumstance does or does not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not. For example, an optional step of isolating a terpene means that the terpene is isolated or is not isolated, or, an optional step of isolating a terpenoid means that the terpenoid is isolated or is not isolated.

As used herein, the abbreviations for any protective groups, amino acids and other compounds, are, unless indicated otherwise, in accord with their common usage, recognized abbreviations, or the IUPAC-IUB Commission on Biochemical Nomenclature (see, (1972) Biochem. 11:1726).

For clarity of disclosure, and not by way of limitation, the detailed description is divided into the subsections that follow.

B. OVERVIEW

Provided herein are cytochrome P450 enzymes from Santalum album, and variants and modified forms thereof, for production of santalols and other sesquiterpenoids. Such cytochrome P450s catalyze the biosynthetic production of santalols or bergamotols from santalenes and bergamotenes, both of which can be generated biosynthetically from farnesyl pyrophosphate by the enzyme santalene synthase (see, WO 2011/000026 and Jones et al. (2011) J Biol Chem 286:17445-17454). Also provided herein are cytochrome P450 reductases from Santalum album, and variants and modified forms thereof. Also provided herein are methods of making santalols and other sesquiterpenoids from farnesyl diphosphate and/or santalenes and bergamotene. The provided cytochrome P450 enzymes provide for production of these valuable products, including santalols and bergamotols, in commercially useful quantities and in a cost effective and energy efficient manner.

1. Biosynthesis of Terpenoids

Terpenes are a large and diverse class of organic compounds that are produced by a variety of plants from acyclic pyrophosphate isoprene precursors such as geranyl pyrophosphate (GPP), farnesyl pyrophosphate (FPP), and geranylgeranyl pyrophosphate (GGPP). Terpenes are named based on the number of isoprene (C₅H₈) units they contain. For example, monoterpenes are derived from GPP and contain 10 carbons, sesquiterpenes are derived from FPP and contain 15 carbons and diterpenes are derived from GGPP and contain 20 carbons. Terpenes that have been chemically modified are referred to as terpenoids or isoprenoids. Terpenes and terpenoids are the primary constituents of essential oils of plants and are widely used as flavor additives for food, fragrances in perfumery and in traditional and alternative medicine.

Santalols and bergamotol are sesquiterpenoids that occur in plants, including the heartwood of Santalum species, including Santalum album (Indian Sandalwood, White Sandalwood, Chandana), Santalum austrocaledonicum (Australian Sandalwood) and Santalum spicatum. Bergamotol can additionally be found in plants such as orchids. Santalols and bergamotol are the oxidation products of santalenes and bergamotene, respectively. In S. album, about 90% of the essential oil is composed of the sesquiterpene alcohols (Z)-α-, (Z)-β-, and (Z)-epi-β-santalol and (Z)-α-exo-bergamotol. The α- and β-santalols are the most important contributors to sandalwood oil fragrance. (Z)-α-Santalol and (Z)-β-santalol are the major components of authentic S. album oil.

The P450 enzymes provided herein can be employed to produce the sesquiterpene alcohols important for the sandalwood oil fragrance. Santalenes and bergamotene are synthesized biosynthetically from the acyclic pyrophosphate precursor FPP by the terpene synthase santalene synthase (see WO 2011/000026 and Jones et al. (2011) J Biol Chem 286:17445-17454). Santalene synthase is known to produce a mixture of santalenes (i.e., α-, β-, epi-β-santalene and α-exo-bergamotene). Exemplary of santalene synthases are Santalum album santalene synthase (SaSSY) set forth in SEQ ID NO:16 and encoding the amino acid sequence set forth in SEQ ID NO:17; Santalum austrocaledonicum santalene synthase (SauSSY, Genbank Accession Nos. HQ343277 or AD087001) set forth in SEQ ID NO:59 and encoding the sequence of amino acids set forth in SEQ ID NO:52; or Santalum spicatum santalene synthase (SspiSSy, Genbank Accession No. HQ343278 or AD087002) set forth in SEQ ID NO:60 and encoding the sequence of amino acids set forth in SEQ ID NO:53.

The cytochrome P450 oxidase polypeptides provided herein are found to catalyze the formation of one or more of an α-santalol from α-santalene, β-santalol from β-santalene, epi-β-santalol from epi-β-santalene and/or α-trans-bergamotol from α-trans-bergamotene. Hydroxylation or monooxygenation of terpene substrates by the cytochrome P450 oxidase is generally performed in the presence of a cytochrome reductase. For example, Santalum album cytochrome reductases (SaCPR) provided herein are included in biosynthesis to supply electrons from NADPH to the cytochrome P450. Thus, the pathways for biosynthesis of santalols and bergamotols, including components of sandalwood oil, can be metabolically engineered in host cells by transforming nucleic acid encoding a cytochrome P450 oxidase and cytochrome P450 reductase provided herein in combination with a nucleic acid molecule encoding a santalene synthase.

a. Santalols

In particular, santalols responsible for the fragrance of sandalwood oil include α-santalols (1 and 9), β-santalols (2 and 10) and epi-β-santalols (3 and 11) (see FIG. 1). (Z)-α-Santalol (Z-α-santalol; (Z)-5-((1R,2S,6S)-2,3-dimethyltricyclo[2.2.1.0^(2,6)]heptan-3-yl)-2-methylpent-2-en-1-ol; 1) and (E)-α-Santalol ((E)-5-((1R,2s,6S)-2,3-dimethyltricyclo[2.2.1.0^(2,6)]heptan-3-yl)-2-methylpent-2-en-1-ol; 9) are synthesized biosynthetically by oxidation of the sesquiterpene α-santalene 5 (see FIG. 1). (Z)-β-Santalol (Z-β-santalol; (Z)-2-methyl-5-[(1S,2R,4R)-2-methyl-3-methylene-bicyclo[2.2.1]heptan-2-yl]pent-2-en-1-ol; 2) and (E)-β-Santalol ((E)-2-methyl-5-((1S,2R,4R)-2-methyl-3-methylenebicyclo[2.2.1]heptan-2-yl)pent-2-en-1-ol; 10) are synthesized biosynthetically by oxidation of the sesquiterpene β-santalene 6 (see FIG. 1). (E)-epi-β-Santalol ((E)-2-methyl-5-((1R,2R,4S)-2-methyl-3-methylenebicyclo[2.2.1]heptan-2-yl)pent-2-en-1-ol; 3) and (Z)-epi-β-Santalol ((Z)-2-methyl-5-((1R,2R,4S)-2-methyl-3-methylenebicyclo[2.2.1]heptan-2-yl)pent-2-en-1-ol; 11) are synthesized biosynthetically by oxidation of the sesquiterpene epi-β-santalene 7 (see FIG. 1).

b. Bergamotol

(Z)-α-trans-Bergamotol ((Z)-α-exo-bergamotol; cis-α-trans-bergamotol; (2Z)-5-[(1S,5S,6R)-2,6-dimethylbicyclo[3.1.1]hept-2-en-6-yl]-2-methyl-2-penten-1-ol; 4) and (E)-α-trans-Bergamotol ((E)-α-exo-bergamotol; (E)-5-((1S,5S,6R)-2,6-dimethylbicyclo[3.1.1]hept-2-en-6-yl)-2-methylpent-2-en-1-ol; 12) are sesquiterpenoids found in sandalwood oil that are synthesized biosynthetically by oxidation of the sesquiterpene α-trans-bergamotene 8 (see FIG. 1).

2. Cytochrome P450 Enzymes

Cytochromes P450 (CYPs) are a superfamily of hemoproteins, or heme-thiolate proteins, that catalyze the singular insertions of oxygen into a diverse range of hydrophobic substrates, often with high regio- and stereoselectivity. Cytochrome P450s are ubiquitous proteins that participate in metabolizing a wide range of compounds. As such, P450s are widespread in nature and are involved in processes such as detoxifying xenobiotics, catabolism of unusual carbon sources and biosynthesis of secondary metabolites. CYPs are noted for their broad substrate specificities and use of oxygen without the need for phosphorylation of adenosine diphosphate (ADP). They can mediate monooxygenations, hydroxylations at nitrogen and sulfur heteroatoms, epoxidations, dehalogenations, deaminations and dealkylations. Particular reactions catalyzed by CYPs include demethylation, hydroxylation, epoxidation, N-oxidation, sulfooxidation, N-, S-, and O-dealkylations, desulfation, deamination, and reduction of azo, nitro, and N-oxide groups.

Typically, cytochrome P450s are monooxygenases, incorporating one oxygen atom into a substrate. In general, monooxygenations require one or two additional proteins to transfer electrons from NAD(P)H to the heme iron and CYPs are placed in groups or classes based on their electron transfer partner. Class I CYPs, common in bacterial and eukaryotic mitochondrial P450 systems, use a FAD-containing reductase and an iron-sulfur redoxin or ferredoxin. The FAD-containing reductase transfers electrons from NAD(P)H to the ferredoxin which in turn reduces the CYP. Class II cytochrome P450s are the most common CYPs in eukaryotes and plants, and also include microsomal and bacterial P450 systems. Class II CYPs use a NADPH:Cytochrome P450 reductase (or cytochrome P450 reductase) to transfer electrons from NAD(P)H to a cytochrome P450. Numerous other classes exist that exploit other electron transfer chains.

Cytochrome P450s are named using a systematic nomenclature that includes the root symbol CYP followed by a number designating the family, a letter designating the subfamily and a number representing the individual gene, for example, CYP76-G5. Families share greater than 40% amino acid sequence identity and subfamilies share greater than 55% amino acid sequence identity.
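The identity thresholds stated above can be expressed as a simple rule. The sketch below is illustrative only: it assumes a pairwise percent amino acid identity has already been computed by some alignment method, and the function name and returned labels are hypothetical. Actual name assignment involves additional considerations and is not performed by a threshold test alone.

```python
def cyp_grouping(percent_identity):
    """Suggest the likely relationship between two cytochrome P450s.

    Applies the rule of thumb stated in the text: members of a family share
    greater than 40% amino acid identity, and members of a subfamily share
    greater than 55% amino acid identity.
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must be between 0 and 100")
    if percent_identity > 55.0:
        return "same subfamily"
    if percent_identity > 40.0:
        return "same family, different subfamily"
    return "different family"


# Hypothetical identity values for three pairwise comparisons.
labels = [cyp_grouping(value) for value in (62.5, 47.0, 33.0)]
```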

Plant cytochrome P450 gene families are very large. For example, total genome sequence examination reveals at least 272 predicted cytochrome P450 genes in Arabidopsis and at least 455 unique cytochrome P450 genes in rice (see, e.g., Nelson et al. (2004) Plant Physiol. 135(2):756-772). Plant CYPs can be localized to the endoplasmic reticulum (ER) and to chloroplasts. In plants, CYPs include a wide range of hydroxylases, epoxidases, peroxidases and oxygenases that largely are based upon Class II monooxygenations. Plant P450s participate in biochemical pathways that include, for example, the synthesis of plant products such as phenylpropanoids, alkaloids, terpenoids, lipids, cyanogenic glycosides, and glucosinolates (see, e.g., Chapple (1998) Annu. Rev. Plant Physiol. Plant Mol. Biol. 49:311-343).

a. Structure

While sequence conservation among cytochrome P450s is relatively low, their general topography and structural fold are highly conserved. There are only 3 absolutely conserved residues among all CYPs, namely the glutamic acid and arginine of the ExxR motif (SEQ ID NO:54) and the heme-binding cysteine. Conserved structural nodules are important for structure and function, and variable regions involved in substrate recognition dictate individual properties (see, e.g., Werck-Reichhart and Feyereisen (2000) Genome Biology 1(6)3000.1-3000.9, Sirim et al. (2010) BMC Structural Biology 10:34 and Baudry et al. (2006) Prot Eng Design & Selection 19:343-353).

Cytochrome P450s typically contain α helices, designated A through L, and β-pleated sheets, designated 1 through 5, contained within a β domain that is associated with substrate recognition and composed predominantly of β sheets and an α domain that contains the catalytic center and is predominantly α helices. The structural regions are as follows, from N-terminus to C-terminus: helix A, β strand 1-1, β strand 1-2, helix B, β strand 1-5, helix B′, helix C, helix C′, helix D, β strand 3-1, helix E, helix F, helix G, helix H, β strand 5-1, β strand 5-2, helix I, helix J, helix J′, helix K, β strand 1-4, β strand 2-1, β strand 2-2, β strand 1-3, helix K′, helix K″, Heme domain, helix L, β strand 3-3, β strand 4-1, β strand 4-2 and β strand 3-2 (see, e.g., Werck-Reichhart and Feyereisen (2000) Genome Biology 1(6)3000.1-3000.9).

Cytochrome P450s are anchored to the endoplasmic reticulum (ER) or chloroplast in plants via a transmembrane helix near the N-terminus of the protein (Chapple (1998) Annu Rev Plant Physiol Plant Mol Biol 49:311-343). The transmembrane helix is typically followed by a hinge region containing a series of basic amino acid residues and a proline-rich region containing the consensus sequence (P/I)PGPx(G/P)xP (SEQ ID NO:55). This hinge region allows for optimal orientation of the enzyme in relation to the membrane. Deletion of the proline hinge region resulted in complete loss of activity (Szczesna-Skorupa et al. (1993) Arch Biochem Biophys 304:170-175) and mutation of proline residues to alanine disrupted structure so as to eliminate heme incorporation (Yamazaki et al. (1993) J Biochem 114:652-657).

The conserved CYP core region is composed of a coil termed the ‘meander’, a four-helix bundle (helices D, E, I and L), helices J and K and two sets of β-sheets (Werck-Reichhart and Feyereisen (2000) Genome Biology 1(6)3000.1-3000.9). The core region contains the heme-binding loop containing the P450 consensus sequence GRRxCP(A/G) (SEQ ID NO:56) located on the proximal face of the heme just before helix L, with an absolutely conserved cysteine that serves as the 5th ligand for the heme iron. The active site for catalysis is the iron-protoporphyrin IX (heme) with the thiolate of the conserved cysteine residue as the fifth ligand; the final coordination site is left to bind and activate molecular oxygen (Groves et al., 1995 In Cytochrome P450: Structure, Mechanism, and Biochemistry (Ed: Ortiz de Montellano) Plenum Press, New York, N.Y., pp. 3-48). The core region also contains the central part of helix I containing the threonine-containing binding pocket for the oxygen molecule required in catalysis having a consensus sequence (A/G)Gx(D/E)T(T/S) (SEQ ID NO:57) which also corresponds to the proton-transfer groove. Finally, the core region contains the absolutely conserved ExxR motif (SEQ ID NO:54) in helix K on the proximal side of heme (see, e.g., Werck-Reichhart and Feyereisen (2000) Genome Biology 1(6)3000.1-3000.9). The proximal face of the enzyme is involved in redox partner recognition and electron transfer to the active site. Protons flow into the active site channel from the distal face. The substrate access channel is located in close contact with the membrane between the F-G loop, A helix and β strands 1-1 and 1-2.
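The consensus sequences recited in this section, written with x for any residue and (A/B) for alternatives, translate directly into regular expressions. The following is a minimal sketch of such a scan; the helper names are hypothetical, and the toy sequence is invented solely to exercise the patterns and is not any sequence of the Sequence Listing.

```python
import re

# Consensus motifs as written in the text (x = any residue, (A/B) = A or B).
CONSENSUS = {
    "proline-rich hinge": "(P/I)PGPx(G/P)xP",
    "helix I oxygen-binding": "(A/G)Gx(D/E)T(T/S)",
    "helix K ExxR": "ExxR",
    "heme-binding loop": "GRRxCP(A/G)",
}


def consensus_to_regex(consensus):
    """Convert the text's consensus notation into a regular expression."""
    # (A/G) -> [AG]; done first so the inserted classes are never rewritten.
    pattern = re.sub(
        r"\(([A-Z/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )
    # x -> any single residue.
    return pattern.replace("x", ".")


def find_motifs(sequence):
    """Return {motif name: (1-based start, matched residues)} for first hits."""
    hits = {}
    for name, consensus in CONSENSUS.items():
        match = re.search(consensus_to_regex(consensus), sequence)
        if match:
            hits[name] = (match.start() + 1, match.group(0))
    return hits


# Invented toy sequence containing one instance of each motif.
toy_sequence = "MAAPPGPKGLPLLAGTDTSEKLRFGAGRRICPGMA"
toy_hits = find_motifs(toy_sequence)
```

Only the first occurrence of each pattern is reported. Short patterns such as ExxR can match by chance at many positions in a real protein, so a hit is a starting point for inspection rather than evidence of a functional motif.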

Cytochrome P450 substrate recognition sites (SRS) are diverse and include SRS1, the loop region between B and C helices; SRS2, the C-terminal end of the F helix; SRS3, part of the FG loop and N-terminal end of the G helix; SRS4, the part of helix I extending over the pyrrole ring B in the active site; SRS5, the loop between the K helix and strand 4 of β-sheet 1; and SRS6, the β turn in β-sheet 4.

b. Function

Cytochrome P450s catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures. Cytochrome P450s activate molecular oxygen using an iron-heme center and use a redox electron shuttle to support the oxidation reaction. The general reaction for hydroxylation by the cytochrome P450 system is,

RH+NADPH+H⁺+O₂→ROH+NADP⁺+H₂O,

where R represents a substrate compound. As noted, typically, cytochrome P450s are monooxygenases, catalyzing the insertion of one of the atoms of molecular oxygen into a substrate, with the second oxygen being reduced to water. Catalysis involves 1) substrate binding; 2) one-electron reduction of the complex to a ferrous state; 3) binding of molecular oxygen to give the superoxide complex; and 4) a second reduction leading to a short lived activated oxygen species. The activated oxygen attacks the substrate resulting, typically, in monooxygenation of the substrate. Other reactions catalyzed by CYPs include dealkylation, dehydration, dehydrogenation, isomerization, dimerization, carbon-carbon bond cleavage and reduction.

3. Cytochrome P450 Reductase

Cytochrome P450 reductases (NADPH:cytochrome P450 reductase; NADPH-cytochrome P450 oxidoreductase; NADPH:ferrihemoprotein oxidoreductase; NADPH:P450 oxidoreductase; CPR; CYPOR; EC 1.6.2.4) are multidomain enzymes of the diflavin reductase family required for electron transfer from NAD(P)H to cytochrome P450s, heme oxygenases, cytochrome b₅ and squalene epoxidases (Louerat-Orieu et al (1998) Eur J Biochem 258:1040-1049). Plants are known to contain multiple isoforms of cytochrome P450 reductases (see, Ro et al. (2002) Plant Physiology 130:1837-1851; Mizutani and Ohta (1998) Plant Physiology 116:357-367). Generally, at least one CPR is constitutively expressed and the other CPRs are enhanced by environmental stresses such as UV light and pathogen infection. In addition, plant cytochrome P450 reductases can be localized to the ER or to the chloroplast, with the location determined by the corresponding partner cytochrome P450 enzyme.

a. Structure

Cytochrome P450 reductases share amino acid sequence homology (about 30% up to about 90%) among different species, including bacteria, yeast, fungi, plants, fish, insects and mammals (Louerat-Orieu et al (1998) Eur J Biochem 258:1040-1049). Cytochrome P450 reductases contain two functional domains, a hydrophobic N-terminal single α-helical membrane anchoring domain (amino acids 1-95 of SEQ ID NO:12) and a hydrophilic C-terminal catalytic domain (amino acids 96-704 of SEQ ID NO:12) (Wang et al. (1997) Proc Natl Acad Sci USA 94:8411-8416). The N-terminal domain contains a hydrophobic membrane anchoring domain (amino acids 40-60 of SEQ ID NO:12) that anchors the protein to a membrane, for example, to the ER or chloroplast in plants, thus ensuring the CPR and the CYP are spatially related to allow for electron transfer. The N-terminal domain is not necessary for activity, as the C-terminal soluble domain alone is capable of transferring electrons to cytochrome c or other electron acceptors. The C-terminal soluble domain contains two structural domains, an N-terminal flavin mononucleotide (FMN) domain (amino acids 101-244 of SEQ ID NO:12) and a C-terminal flavin adenine dinucleotide (FAD) domain (amino acids 301-704 of SEQ ID NO:12) (Dym and Eisenberg (2001) Protein Science 10:1712-1728). The FMN domain is homologous to flavodoxin, which allows for binding of the flavin cofactor FMN. The FAD domain contains binding domains for the flavin cofactor FAD and for NADPH, and additionally contains residues necessary for catalytic activity. The FMN and FAD domains are joined by a connecting domain (amino acids 245-300 of SEQ ID NO:12) that is responsible for the relative orientation of the FMN and FAD domains, ensuring proper alignment of the two flavin cofactors necessary for efficient electron transfer.
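The domain boundaries recited above for SEQ ID NO:12 can be held in a small lookup table, for example to annotate a residue of interest. This is a minimal sketch: the names REGIONS and regions_for are hypothetical, and treating residue 704 as the final residue is an assumption drawn only from the ranges stated in the text.

```python
# Residue ranges (1-based, inclusive) for SEQ ID NO:12, as stated in the text.
REGIONS = [
    ("N-terminal membrane-anchoring domain", 1, 95),
    ("C-terminal catalytic domain", 96, 704),
    ("FMN domain", 101, 244),
    ("connecting domain", 245, 300),
    ("FAD domain", 301, 704),
]

# Assumption: 704 is the last residue, since no stated range extends past it.
LAST_RESIDUE = 704


def regions_for(position):
    """Return the names of every listed region containing the residue."""
    if not 1 <= position <= LAST_RESIDUE:
        raise ValueError("position outside residues 1-704")
    return [name for name, start, end in REGIONS if start <= position <= end]


# Example lookups for three arbitrary positions.
anchor_hit = regions_for(50)
fmn_hit = regions_for(150)
linker_hit = regions_for(270)
```

Because the FMN, connecting and FAD domains all lie within the C-terminal catalytic domain, a residue in any of them is reported in two regions. Residues 96-100 fall in the catalytic domain only.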

The N-terminal FMN domain has an antiparallel β-structure while the C-terminal NAD(P) subdomain has the topology typical of pyridine dinucleotide-binding folds. The FMN domain contains a five-stranded β-sheet flanked by five α-helices, with the FMN positioned at the C-terminal side of the β-sheet. The core of the FAD binding domain is an anti-parallel flattened β-barrel and the NADP(H) binding domain is a parallel five-stranded β-sheet flanked by α-helices. The connecting domain is composed mainly of α-helices. The structural regions are as follows, from N-terminus to C-terminus: α-helix A; β-strand 1; α-helix B; β-strand 2; α-helix C; β-strand 3; α-helix D; β-strand 4; α-helix E; β-strand 5; α-helix F; β-strand 6; β-strand 7; β-strand 8; β-strand 9; β-strand 10; α-helix G; β-strand 11; β-strand 12; β-strand 12′; α-helix H; α-helix I; α-helix J; α-helix K; α-helix M; β-strand 13; β-strand 14; β-strand 15; α-helix N; β-strand 16; β-strand 16′; β-strand 17; α-helix O; β-strand 18; α-helix P; β-strand 19; α-helix Q; α-helix R; β-strand 20; α-helix S; α-helix T; and β-strand 21.

Cytochrome P450 reductases contain conserved cofactor and substrate binding domains, including FMN-, FAD- and NADPH-binding regions and cytochrome c- and cytochrome P450-binding sites. The P450 and cytochrome c binding sites contain amino acids 232-240 of SEQ ID NO:12. The FMN domain contains binding regions for the FMN pyrophosphate (amino acids 98-119 of SEQ ID NO:12) and the FMN isoalloxazine ring (amino acids 161-214 of SEQ ID NO:12). The FAD domain contains binding regions for the FAD pyrophosphate (amino acids 317-353 of SEQ ID NO:12) and the FAD isoalloxazine ring (amino acids 482-505 of SEQ ID NO:12). The FAD binding pocket includes amino acids 344, 482, 484, 485, 500-502, 516-519 and 704 of SEQ ID NO:12 and the FAD binding motif includes amino acids 482, 484 and 485 of SEQ ID NO:12. The FAD domain also contains binding regions for the NADPH ribose and pyrophosphate (amino acids 555-576 of SEQ ID NO:12) and the NADPH nicotinamide (amino acids 651-668 of SEQ ID NO:12). The NADPH binding pocket includes amino acid residues 324, 502, 204, 560, 561, 595, 595, 624, 625, 630, 632, 634, 659, 663 and 666 of SEQ ID NO:12. Amino acid residues Ser485, Cys657, Asp702 and Trp704 of SEQ ID NO:12 are the catalytic residues involved in hydride transfer (Hubbard et al. (2001) J Biol Chem 276:29163-29170). Amino acid residues 516, 519 and 522 of SEQ ID NO:12 are involved in the phosphate binding motif (Dym and Eisenberg (2001) Protein Science 10:1712-1728). The βαβ structure motif is formed from amino acid residues 557, 560-563 and 565 of SEQ ID NO:12 (Dym and Eisenberg (2001) Protein Science 10:1712-1728).

b. Function

Cytochrome P450 reductases shuttle two electrons from NAD(P)H to cytochrome P450 through the flavin cofactors FAD and FMN. FAD receives a hydride anion from the two-electron donor NAD(P)H and passes the electrons one at a time to FMN. FMN then donates the electrons to the cytochrome P450. Cytochrome P450 uses the electrons, as described above, for the hydroxylation of various substrates.

C. CYTOCHROME P450 POLYPEPTIDES AND NUCLEIC ACID MOLECULES ENCODING THE CYTOCHROME P450 POLYPEPTIDES

Provided herein are cytochrome P450 polypeptides, including cytochrome P450 santalene oxidase polypeptides and cytochrome P450 bergamotene oxidase polypeptides. Also provided herein are nucleic acid molecules that encode any of the cytochrome P450 polypeptides provided herein. The cytochrome P450 santalene oxidase polypeptides provided herein catalyze the formation of α-santalol, β-santalol or epi-β-santalol from α-santalene, β-santalene or epi-β-santalene, respectively, including the production of β-santalol from β-santalene. The cytochrome P450 santalene oxidase polypeptides provided herein are also capable of catalyzing the formation of α-trans-bergamotol from α-trans-bergamotene. In some examples, the nucleic acid molecules that encode the cytochrome P450 santalene oxidase polypeptides are those that are the same as or substantially the same as those that are isolated from the sandalwood tree Santalum album. In other examples, the nucleic acid molecules and encoded cytochrome P450 santalene oxidase polypeptides are variants of those isolated from the sandalwood tree Santalum album. The cytochrome P450 bergamotene oxidase polypeptides provided herein catalyze the formation of α-trans-bergamotol from α-trans-bergamotene. In some examples, the nucleic acid molecules that encode the cytochrome P450 bergamotene oxidase polypeptides are those that are the same as those that are isolated from the sandalwood tree Santalum album. In other examples, the nucleic acid molecules and encoded cytochrome P450 bergamotene oxidase polypeptides are variants of those isolated from the sandalwood tree Santalum album.

Also provided herein are modified cytochrome P450 polypeptides and nucleic acid molecules that encode any of the modified cytochrome P450 polypeptides provided herein. The modifications can be made in any region of a cytochrome P450 polypeptide, including a cytochrome P450 santalene oxidase polypeptide or cytochrome P450 bergamotene oxidase polypeptide, provided the resulting modified cytochrome P450 polypeptide at least retains the catalytic activity of the unmodified cytochrome P450 polypeptide. For example, modifications can be made to a cytochrome P450 santalene oxidase polypeptide provided the resulting modified cytochrome P450 santalene oxidase polypeptide retains cytochrome P450 santalene oxidase activity (i.e., the ability to catalyze the hydroxylation of a santalene, namely α-santalene, β-santalene or epi-β-santalene). In another example, modifications can be made to a cytochrome P450 bergamotene oxidase polypeptide provided the resulting modified cytochrome P450 bergamotene oxidase polypeptide retains cytochrome P450 bergamotene oxidase activity (i.e., the ability to catalyze the hydroxylation of a bergamotene, namely α-trans-bergamotene).

The modifications can include, but are not limited to, codon optimization of the nucleic acids and/or changes that result in a single amino acid modification in the encoded polypeptide, such as a single amino acid replacement (substitution), insertion or deletion, or in multiple amino acid modifications, such as multiple amino acid replacements, insertions or deletions, including swaps of regions or domains of the polypeptide. In some examples, entire or partial domains or regions, such as any domain or region described herein below, are exchanged with corresponding domains or regions or portions thereof from another cytochrome P450 polypeptide. Exemplary of modifications are amino acid replacements, including single or multiple amino acid replacements. For example, modified cytochrome P450 polypeptides provided herein can contain at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 90, 95, 100, 105, 110, 115, 120 or more modified positions compared to the cytochrome P450 polypeptide not containing the modification.

Provided herein are cytochrome P450 polypeptides from the CYP76 family. Provided herein is a CYP76 cytochrome P450 polypeptide having a sequence of amino acids set forth in SEQ ID NO:50. Also provided herein are cytochrome P450 polypeptides that exhibit at least 60% amino acid sequence identity to a cytochrome P450 polypeptide set forth in SEQ ID NO:50. For example, the cytochrome P450 polypeptides provided herein can exhibit at least or at least about 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to a cytochrome P450 polypeptide set forth in SEQ ID NO:50, provided the resulting cytochrome P450 polypeptide at least retains cytochrome P450 monooxygenase activity (i.e., the ability to catalyze the hydroxylation or monooxygenation of a terpene). Also provided herein are modified cytochrome P450 polypeptides from the CYP76 family. In particular, modified cytochrome P450 polypeptides provided herein contain amino acid replacements or substitutions, additions or deletions, truncations or combinations thereof with reference to the cytochrome P450 polypeptide having a sequence of amino acids set forth in SEQ ID NO:50. It is within the level of one of skill in the art to make such modifications in cytochrome P450 polypeptides or any variant thereof and test each for cytochrome P450 activity described herein, such as monooxygenase activity.

Also provided herein are CYP76 nucleic acid molecules that have a sequence of nucleotides set forth in SEQ ID NO:1, or degenerates thereof, that encode a cytochrome P450 polypeptide having a sequence of amino acids set forth in SEQ ID NO:50. The CYP76 nucleic acid molecule set forth in SEQ ID NO:1 can be used to design primers that are used to identify and/or clone additional CYP proteins. Also provided herein are nucleic acid molecules encoding a cytochrome P450 polypeptide having at least 85% sequence identity to a sequence of nucleotides set forth in SEQ ID NO:1. For example, the nucleic acid molecules provided herein can exhibit at least or about at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or more sequence identity to a sequence of nucleotides set forth in SEQ ID NO:1, so long as the encoded cytochrome P450 polypeptide at least retains cytochrome P450 monooxygenase activity (i.e., the ability to catalyze the hydroxylation of a terpene). Also provided herein are degenerate sequences of the sequence set forth in SEQ ID NO:1 encoding a cytochrome P450 polypeptide having a sequence of amino acids set forth in SEQ ID NO:50. Percent identity can be determined by one skilled in the art using standard alignment programs.
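
As a rough illustration of what a percent identity figure expresses, the following Python sketch computes identity over two sequences that have already been aligned, with gaps written as "-". It is a simplification and not the method of any particular alignment program: such programs differ in how they treat gaps and which length they use as the denominator. Here the denominator is assumed to be the number of aligned columns in which at least one sequence has a residue. The function names and the toy sequences used to exercise them are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: columns where both sequences have a gap are ignored, and
    every other column counts toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # gap-only column carries no information
        columns += 1
        if a == b:
            matches += 1  # identical residues (a gap never equals a residue)
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_threshold(aligned_a, aligned_b, threshold=85.0):
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

The default threshold of 85% mirrors the nucleotide identity floor recited above; a different floor, such as the 60% recited for amino acid sequences, can be passed explicitly.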

Provided herein are cytochrome P450 SaCYP76F39v1 (CYP76-G10), SaCYP76F42 (CYP76-G13), SaCYP76F39v2 (CYP76-G15), SaCYP76F40 (CYP76-G16) and SaCYP76F41 (CYP76-G17) polypeptides. Provided herein are cytochrome P450 santalene oxidase polypeptides having a sequence of amino acids set forth in SEQ ID NO:7, 74, 75, 76 or 77. Also provided herein are cytochrome P450 santalene oxidase polypeptides that exhibit at least 60% amino acid sequence identity to a cytochrome P450 santalene oxidase polypeptide set forth in any of SEQ ID NOS:7, 74, 75, 76 or 77. For example, the cytochrome P450 santalene oxidase polypeptides provided herein can exhibit at least or at least about 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to a cytochrome P450 santalene oxidase polypeptide set forth in SEQ ID NO:7, 74, 75, 76 or 77, provided the resulting cytochrome P450 santalene oxidase polypeptides at least retain cytochrome P450 santalene oxidase activity (i.e., the ability to catalyze the hydroxylation of a santalene, namely α-santalene, β-santalene or epi-β-santalene). Percent identity can be determined by one skilled in the art using standard alignment programs.

Provided herein are cytochrome P450 SaCYP76F38v1 (CYP76-G5), SaCYP76F37v1 (CYP76-G11), SaCYP76F38v2 (CYP76-G12) and SaCYP76F37v2 (CYP76-G14) polypeptides. Provided herein are cytochrome P450 bergamotene oxidase polypeptides having a sequence of amino acids set forth in SEQ ID NO:6, 8, 9 or 73. Also provided herein are cytochrome P450 bergamotene oxidase polypeptides that exhibit at least 60% amino acid sequence identity to a cytochrome P450 bergamotene oxidase polypeptide set forth in SEQ ID NO:6, 8, 9 or 73. For example, the cytochrome P450 bergamotene oxidase polypeptides provided herein can exhibit at least or at least about 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to a cytochrome P450 bergamotene oxidase polypeptide set forth in SEQ ID NO:6, 8, 9 or 73, provided the resulting cytochrome P450 bergamotene oxidase polypeptide at least retains cytochrome P450 bergamotene oxidase activity (i.e., the ability to catalyze the hydroxylation of a bergamotene). Percent identity can be determined by one skilled in the art using standard alignment programs.

Also provided herein is a cytochrome P450 SaCYP76F43 (CYP76-G18) polypeptide. Provided herein is a cytochrome P450 polypeptide having a sequence of amino acids set forth in SEQ ID NO:78. Also provided herein are cytochrome P450 polypeptides that exhibit at least 60% amino acid sequence identity to a cytochrome P450 polypeptide set forth in SEQ ID NO:78. For example, the cytochrome P450 polypeptides provided herein can exhibit at least or at least about 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to a cytochrome P450 polypeptide set forth in SEQ ID NO:78, provided the resulting cytochrome P450 polypeptide at least retains cytochrome P450 monooxygenase activity (i.e., the ability to catalyze the hydroxylation or monooxygenation of a terpene). In particular, modified cytochrome P450 polypeptides provided herein contain amino acid replacements or substitutions, additions or deletions, truncations or combinations thereof with reference to the cytochrome P450 polypeptide having a sequence of amino acids set forth in SEQ ID NO:78. It is within the level of one of skill in the art to make such modifications in cytochrome P450 polypeptides or any variant thereof and test each for cytochrome P450 activity described herein, such as monooxygenase activity.

Also, in some examples, provided herein are catalytically active fragments of cytochrome P450 polypeptides. In some examples, the active fragments of the cytochrome P450 polypeptides, including the cytochrome P450 santalene oxidase or cytochrome P450 bergamotene oxidase polypeptides, are modified as described above. Such fragments retain one or more properties of a full-length cytochrome P450 polypeptide, including full-length santalene oxidase or cytochrome P450 bergamotene oxidase polypeptides. Typically, the active fragments exhibit cytochrome P450 santalene oxidase or cytochrome P450 bergamotene oxidase activity (i.e., catalyze the formation of santalols and bergamotols, respectively).

The cytochrome P450s provided herein, including the cytochrome P450 santalene oxidase or cytochrome P450 bergamotene oxidase polypeptides provided herein, can contain other modifications, for example, modifications not in the primary sequence of the polypeptide, including post-translational modifications. For example, modifications described herein can be in a cytochrome P450 santalene oxidase or cytochrome P450 bergamotene oxidase that is a fusion polypeptide or chimeric polypeptide, including hybrids of different cytochrome P450 santalene oxidase or cytochrome P450 bergamotene oxidase polypeptides or different cytochrome P450 monooxygenases (e.g., containing one or more domains or regions from another cytochrome P450 monooxygenase) and also synthetic cytochrome P450 santalene oxidase or cytochrome P450 bergamotene oxidase polypeptides prepared recombinantly or synthesized or constructed by other methods known in the art based upon the sequence of known polypeptides.

The cytochrome P450 santalene oxidase polypeptides or cytochrome P450 bergamotene oxidase polypeptides provided herein can be used to catalyze the production of santalols and bergamotols, respectively. Typically, the cytochrome P450 santalene oxidase polypeptides provided herein catalyze the formation of santalols from santalenes, e.g., they catalyze the hydroxylation of santalenes. In some examples, the cytochrome P450 santalene oxidases also catalyze the formation of bergamotols from bergamotenes. Typically, the cytochrome P450 bergamotene oxidase polypeptides provided herein catalyze the formation of bergamotol from bergamotene, e.g., they catalyze the hydroxylation of bergamotene. Reactions can be performed in vivo, such as in a host cell into which the nucleic acid has been introduced. At least one of the polypeptides will be heterologous to the host. Reactions also can be performed in vitro by contacting the enzyme with the appropriate substrate under appropriate conditions.

Also provided herein are nucleic acid molecules encoding a santalene synthase and a cytochrome P450 santalene oxidase. Also provided herein are nucleic acid molecules encoding a santalene synthase and a cytochrome P450 bergamotene oxidase. In such examples, expression of the nucleic acid molecule in a suitable host, for example, a bacterial or yeast cell, results in expression of santalene synthase and the cytochrome P450 oxidase. Such cells can be used to produce the santalene synthases and the cytochrome P450 oxidases and/or to perform reactions in vivo to produce santalols and bergamotols. For example, santalols and bergamotols can be generated in a host cell from farnesyl diphosphate (FPP), particularly a yeast cell that overproduces the acyclic terpene precursor FPP. In some examples, a nucleic acid molecule encoding a farnesyl diphosphate synthase, such as a Santalum album farnesyl diphosphate synthase, can also be expressed in the suitable host, for example, a bacterial or yeast cell, resulting in overproduction of FPP.

Also provided herein are nucleic acid molecules encoding a santalene synthase, a cytochrome P450 polypeptide and a cytochrome P450 reductase polypeptide. For example, provided herein are nucleic acid molecules encoding a santalene synthase, a cytochrome P450 santalene oxidase polypeptide and a cytochrome P450 reductase polypeptide. In another example, provided herein are nucleic acid molecules encoding a santalene synthase, a cytochrome P450 bergamotene oxidase polypeptide and a cytochrome P450 reductase polypeptide. The nucleic acid molecules can be in the same vector or plasmid or on different vectors or plasmids. In such examples, expression of the nucleic acid molecule in a suitable host, for example, a bacterial or yeast cell, results in expression of santalene synthase and the cytochrome P450 oxidase. Such cells can be used to produce the santalene synthases and the cytochrome P450 oxidases and/or to perform reactions in vivo to produce santalols and bergamotols. For example, santalols and bergamotols can be generated in a host cell from farnesyl diphosphate (FPP), particularly a yeast cell that overproduces the acyclic terpene precursor FPP.

1. Cytochrome P450 Santalene Oxidase Polypeptides

Provided herein are cytochrome P450 santalene oxidase polypeptides. Also provided herein are nucleic acid molecules that encode any of the cytochrome P450 santalene oxidase polypeptides provided herein. The cytochrome P450 santalene oxidase polypeptides provided herein catalyze the formation of terpenoids found in sandalwood oil, including α-santalols, β-santalols, epi-β-santalols and α-trans-bergamotols. The cytochrome P450 santalene oxidase polypeptides provided herein catalyze the formation of santalols from santalenes. In some examples, the cytochrome P450 santalene oxidase polypeptides provided herein also catalyze the formation of bergamotols from bergamotene. For example, the cytochrome P450 santalene oxidase polypeptides catalyze the formation of α-santalol from α-santalene, β-santalol from β-santalene and/or epi-β-santalol from epi-β-santalene (e.g., the cytochrome P450 santalene oxidase polypeptides catalyze the hydroxylation of α-santalene, β-santalene and/or epi-β-santalene). In a particular example, the cytochrome P450 santalene oxidase polypeptides catalyze the formation of (E)-α-santalol from α-santalene, (Z)-α-santalol from α-santalene, (E)-β-santalol from β-santalene, (Z)-β-santalol from β-santalene, (E)-epi-β-santalol from epi-β-santalene and/or (Z)-epi-β-santalol from epi-β-santalene. In some examples, the cytochrome P450 santalene oxidase polypeptides provided herein also catalyze the formation of (Z)-α-trans-bergamotol and/or (E)-α-trans-bergamotol from α-trans-bergamotene. In a particular example, the cytochrome P450 santalene oxidase polypeptides provided herein catalyze the formation of (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol, (Z)-α-trans-bergamotol and/or (E)-α-trans-bergamotol. In particular, the cytochrome P450 santalene oxidase polypeptides produce (Z) and (E) stereoisomers of α- and β-santalol in ratios of approximately 1:5 and 1:4, respectively. The cytochrome P450 santalene oxidase polypeptides exhibit narrow substrate specificity, preferring α-santalene or β-santalene. In some examples, the cytochrome P450 santalene oxidase polypeptides also convert the substrate α-bisabolol.
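
The approximate stereoisomer ratios stated above can be restated as fractions of each santalol. The short Python sketch below does that arithmetic; it assumes the ratios read in the order listed, i.e., (Z):(E), and the function name `ratio_to_fractions` is hypothetical.

```python
from fractions import Fraction


def ratio_to_fractions(z_parts, e_parts):
    """Convert a (Z):(E) ratio into the fraction of each stereoisomer.

    Assumption: the ratio is given in the order (Z):(E).
    """
    total = z_parts + e_parts
    return Fraction(z_parts, total), Fraction(e_parts, total)


# Approximate ratios from the text: 1:5 for alpha-santalol, 1:4 for beta-santalol.
alpha_z, alpha_e = ratio_to_fractions(1, 5)  # 1/6 (Z) and 5/6 (E)
beta_z, beta_e = ratio_to_fractions(1, 4)    # 1/5 (Z) and 4/5 (E)
```

Under that reading, roughly five-sixths of the α-santalol and four-fifths of the β-santalol formed would be the (E) stereoisomer.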

In some examples, the cytochrome P450 santalene oxidase polypeptides provided herein catalyze the formation of terpenoids found in sandalwood oil, including α-santalol, β-santalol, epi-β-santalol and α-trans-bergamotol, from the terpene reaction products of the acyclic precursor farnesyl pyrophosphate and a santalene synthase. For example, the cytochrome P450 santalene oxidase polypeptides provided herein catalyze the formation of (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol, (Z)-α-trans-bergamotol and/or (E)-α-trans-bergamotol from the terpene reaction products of the acyclic precursor FPP and a santalene synthase, such as Santalum album santalene synthase (SaSSY; SEQ ID NO:16). The cytochrome P450 santalene oxidase polypeptides catalyze the formation of (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol, (Z)-α-trans-bergamotol and/or (E)-α-trans-bergamotol in different ratios from those of authentic sandalwood oil (see Example 11 and FIGS. 15A and 15B). For example, the main products formed with SaCYP76F39v1 (CYP76-G10) were (E)-α-santalol and (E)-β-santalol while the main compounds of sandalwood oil are (Z)-α-santalol and (Z)-β-santalol (see FIGS. 15A and 15B).

For example, provided herein are cytochrome P450 santalene oxidase polypeptides that have a sequence of amino acids set forth in any of SEQ ID NOS:7, 74, 75, 76 and 77. Also provided herein are cytochrome P450 santalene oxidase polypeptides that exhibit at least 60% amino acid sequence identity to a cytochrome P450 santalene oxidase polypeptide having a sequence of amino acids set forth in any of SEQ ID NOS:7, 74, 75, 76 and 77. For example, the cytochrome P450 santalene oxidase polypeptides provided herein can exhibit at least or at least about 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or more amino acid sequence identity to a cytochrome P450 santalene oxidase polypeptide set forth in any of SEQ ID NOS:7, 74, 75, 76 and 77, provided the cytochrome P450 santalene oxidase polypeptides exhibit cytochrome P450 santalene oxidase activity (i.e., catalyze the formation of santalols from santalenes and/or bergamotols from bergamotene). Percent identity can be determined by one skilled in the art using standard alignment programs.

Provided herein are cytochrome P450 santalene oxidases designated SaCYP76F39v1 (CYP76-G10), SaCYP76F39v2 (CYP76-G15), SaCYP76F40 (CYP76-G16), SaCYP76F41 (CYP76-G17) and SaCYP76F42 (CYP76-G13) that have a sequence of amino acids set forth in SEQ ID NOS:7, 74, 75, 76 and 77, respectively. Also provided herein are active fragments of cytochrome P450 santalene oxidase polypeptides having a sequence of amino acids set forth in any of SEQ ID NOS:7, 74, 75, 76 and 77. Such fragments retain one or more properties of a cytochrome P450 santalene oxidase polypeptide. Typically, the active fragments exhibit cytochrome P450 santalene oxidase activity (i.e., the ability to catalyze the formation of santalols from santalenes).

Also provided herein are nucleic acid molecules that have a sequence of nucleotides set forth in any of SEQ ID NOS:3, 68, 69, 70 and 71, or degenerates thereof, that encode a cytochrome P450 santalene oxidase polypeptide having a sequence of amino acids set forth in SEQ ID NOS:7, 74, 75, 76 and 77, respectively. Also provided herein are nucleic acid molecules encoding cytochrome P450 santalene oxidase polypeptides having at least 85% sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:3, 68, 69, 70 and 71. For example, the nucleic acid molecules provided herein can exhibit at least or about at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or more sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:3, 68, 69, 70 and 71, so long as the encoded cytochrome P450 santalene oxidase polypeptides exhibit cytochrome P450 santalene oxidase activity (i.e., the ability to catalyze the formation of santalols from santalenes). Also provided herein are degenerate sequences of the sequence set forth in any of SEQ ID NOS:3, 68, 69, 70 and 71 encoding cytochrome P450 santalene oxidase polypeptides having a sequence of amino acids set forth in SEQ ID NOS:7, 74, 75, 76 and 77, respectively. Percent identity can be determined by one skilled in the art using standard alignment programs.

In some examples, the nucleic acid molecules that encode the cytochrome P450 santalene oxidase polypeptides are isolated from the sandalwood tree Santalum album. In other examples, the nucleic acid molecules and encoded cytochrome P450 santalene oxidase polypeptides are variants of those isolated from the sandalwood tree Santalum album.

In a particular example, the SaCYP76F39v1 (CYP76-G10) polypeptide having a sequence of amino acids set forth in SEQ ID NO:7 catalyzed the formation of (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol, (Z)-α-trans-bergamotol and (E)-α-trans-bergamotol in in vivo assays in yeast expressing a santalene synthase (see Example 10.B.2) and in in vitro assays with a mixture of α-santalene, α-trans-bergamotene, epi-β-santalene and β-santalene as the substrate (see Example 11.B.2.a.ii). In in vivo assays, (E)-β-santalol, (E)-α-santalol and (Z)-β-santalol were the major products (see FIG. 11A). In in vitro assays, (E)-β-santalol and (E)-α-santalol were the major products (see FIG. 15A). In yet other examples, in in vitro assays with either α-santalene, α-trans-bergamotene, or epi-β-santalene and β-santalene, the SaCYP76F39v1 (CYP76-G10) polypeptide catalyzed the formation of (Z)- and (E)-α-santalol, (Z)- and (E)-α-trans-bergamotol, and (Z)- and (E)-epi-β-santalol and (Z)- and (E)-β-santalol, respectively (see Example 11.C. and FIGS. 20A-20C). The kinetic properties of the SaCYP76F39v1 (CYP76-G10) polypeptide for α- and β-santalene as substrates are described in Example 12 below.

In another example, the SaCYP76F39v2 (CYP76-G15) polypeptide having a sequence of amino acids set forth in SEQ ID NO:74 catalyzed the formation of (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol, (Z)-α-trans-bergamotol and (E)-α-trans-bergamotol in in vivo assays in yeast expressing a santalene synthase (see Example 10.B.3 and FIG. 13A). In in vitro assays with a mixture of α-santalene, α-trans-bergamotene, epi-β-santalene and β-santalene as the substrate, the SaCYP76F39v2 (CYP76-G15) polypeptide catalyzed the formation of (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol, (Z)-α-trans-bergamotol and (E)-α-trans-bergamotol (see Example 11.B.3.b and FIG. 16A) with (E)-α-santalol and (E)-β-santalol as the major products.

In another example, the SaCYP76F40 (CYP76-G16) polypeptide having a sequence of amino acids set forth in SEQ ID NO:75 catalyzed the formation of (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-α-trans-bergamotol and (E)-α-trans-bergamotol in in vivo assays in yeast expressing a santalene synthase (see Example 10.B.3 and FIG. 13B). In in vitro assays with a mixture of α-santalene, α-trans-bergamotene, epi-β-santalene and β-santalene as the substrate, the SaCYP76F40 (CYP76-G16) polypeptide catalyzed the formation of (E)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (Z)-α-trans-bergamotol and (E)-α-trans-bergamotol (see Example 11.B.3.b and FIG. 16B) with (E)-α-trans-bergamotol and (E)-β-santalol as the major products.

In another example, the SaCYP76F41 (CYP76-G17) polypeptide having a sequence of amino acids set forth in SEQ ID NO:76 catalyzed the formation of (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol and (E)-α-trans-bergamotol in in vivo assays in yeast expressing a santalene synthase (see Example 10.B.3 and FIG. 13C). In in vitro assays with a mixture of α-santalene, α-trans-bergamotene, epi-β-santalene and β-santalene as the substrate, the SaCYP76F41 (CYP76-G17) polypeptide catalyzed the formation of (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol, (Z)-α-trans-bergamotol and (E)-α-trans-bergamotol (see Example 11.B.3.b and FIG. 16C) with (E)-α-santalol as the major product.

In another example, the SaCYP76F42 (CYP76-G13) polypeptide having a sequence of amino acids set forth in SEQ ID NO:77 catalyzed the formation of (Z)-α-santalol, (Z)-β-santalol, (E)-epi-β-santalol and (E)-α-trans-bergamotol in in vivo assays in yeast expressing a santalene synthase (see Example 10.B.3 and FIG. 13D). In in vitro assays with a mixture of α-santalene, α-trans-bergamotene, epi-β-santalene and β-santalene as the substrate, the SaCYP76F42 (CYP76-G13) polypeptide catalyzed the formation of (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol, (Z)-α-trans-bergamotol and (E)-α-trans-bergamotol (see Example 11.B.3.b and FIG. 16D) with (E)-α-trans-bergamotol as the major product.

Modified Cytochrome P450 Santalene Oxidase Polypeptides

Also provided herein are modified cytochrome P450 santalene oxidase polypeptides. The modifications, which typically are amino acid insertions, deletions and/or substitutions, can be effected in any region of a cytochrome P450 santalene oxidase polypeptide provided the resulting modified cytochrome P450 santalene oxidase polypeptides at least retain cytochrome P450 santalene oxidase activity. For example, modifications can be made in any region of a cytochrome P450 santalene oxidase provided the resulting modified cytochrome P450 santalene oxidase at least retains cytochrome P450 santalene oxidase activity (i.e., the ability to catalyze the formation of santalols from santalenes). The modifications can be a single amino acid modification, such as single amino acid replacements (substitutions), insertions or deletions, or multiple amino acid modifications, such as multiple amino acid replacements, insertions or deletions. In some examples, entire or partial domains or regions, such as any domain or region described herein below, are exchanged with corresponding domains or regions or portions thereof from another cytochrome P450 polypeptide. Exemplary of modifications are amino acid replacements, including single or multiple amino acid replacements. For example, modified cytochrome P450 santalene oxidase polypeptides provided herein can contain at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 90, 95, 100, 105, 110, 115, 120 or more modified positions compared to the cytochrome P450 santalene oxidase polypeptide not containing the modification. For example, the modifications described herein can be in a cytochrome P450 santalene oxidase polypeptide having a sequence of amino acids set forth in any of SEQ ID NOS:7, 74, 75, 76 and 77 or any variant thereof, including any that have at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to a cytochrome P450 santalene oxidase polypeptide set forth in any of SEQ ID NOS:7, 74, 75, 76 or 77. Based on this description, it is within the level of one of skill in the art to generate a cytochrome P450 santalene oxidase polypeptide containing any one or more of the described mutations, and test each for cytochrome P450 santalene oxidase activity described herein.

Also, in some examples, provided herein are modified active fragments of cytochrome P450 santalene oxidase polypeptides that contain any of the modifications provided herein. Such fragments retain one or more properties of a cytochrome P450 santalene oxidase. Typically, the cytochrome P450 santalene oxidase polypeptides exhibit santalene oxidase activity (i.e., the ability to hydroxylate santalene and/or bergamotene).

Modifications in a cytochrome P450 santalene oxidase also can be made to a cytochrome P450 santalene oxidase polypeptide that also contains other modifications, including modifications of the primary sequence and modifications not in the primary sequence of the polypeptide. For example, modifications described herein can be in a cytochrome P450 santalene oxidase polypeptide that is a fusion polypeptide or chimeric polypeptide, including hybrids of different cytochrome P450 santalene oxidase polypeptides with different cytochrome P450 polypeptides (e.g., containing one or more domains or regions from another cytochrome P450) and also synthetic cytochrome P450 santalene oxidase polypeptides prepared recombinantly or synthesized or constructed by other methods known in the art based upon the sequence of known polypeptides.

In some examples, the modifications are amino acid replacements. In further examples, the modified cytochrome P450 santalene oxidase polypeptides provided herein contain one or more modifications in a domain. As described elsewhere herein, the modifications in a domain or structural domain can be by replacement of corresponding heterologous residues from another cytochrome P450 polypeptide.

To retain cytochrome P450 santalene oxidase activity, modifications typically are not made at those positions necessary for cytochrome P450 santalene oxidase activity, i.e., in the catalytic center or in conserved residues. For example, generally modifications are not made at a position corresponding to Glu367, Arg370, Gly445, Arg446, Arg447, Ile448, Cys449, Pro450 or Gly451 with reference to a sequence of amino acids set forth in any of SEQ ID NOS:7, 74, 75, 76 or 77.
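As an illustration only, a proposed set of amino acid replacements can be screened against the positions listed above before any construct is made. The following Python sketch assumes that replacements are written in a hypothetical one-letter shorthand such as C449S (original residue, position, replacement residue) and that positions are numbered with reference to the sequence being modified. The function names and the example replacements are assumptions made for this sketch and are not taken from the specification.

```python
import re

# Positions named in the text as generally not modified
# (Glu367, Arg370, Gly445, Arg446, Arg447, Ile448, Cys449, Pro450, Gly451).
CONSERVED_POSITIONS = {367, 370, 445, 446, 447, 448, 449, 450, 451}

# Hypothetical shorthand: one-letter original residue, position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'C449S' into (original, position, replacement)."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a single-residue replacement: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def conflicts_with_conserved(labels):
    """Return the labels whose position falls on a conserved position."""
    return [
        label
        for label in labels
        if parse_substitution(label)[1] in CONSERVED_POSITIONS
    ]


# Hypothetical replacements, used only to exercise the check.
flagged = conflicts_with_conserved(["A123G", "C449S", "L200F"])
print(flagged)
```

A replacement that is flagged is not necessarily inactive; the check only marks candidates that touch the positions the text identifies as generally left unmodified.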

The modified cytochrome P450 santalene oxidase polypeptides can contain two or more modifications, including amino acid replacements or substitutions, insertions or deletions, truncations or combinations thereof. Generally, multiple modifications provided herein can be combined by one of skill in the art so long as the modified cytochrome P450 santalene oxidase polypeptide retains cytochrome P450 santalene oxidase activity.

Also provided herein are nucleic acid molecules that encode any of the modified cytochrome P450 santalene oxidase polypeptides provided herein. In particular examples, the nucleic acid sequence can be codon optimized, for example, to increase expression levels of the encoded sequence. The particular codon usage is dependent on the host organism in which the modified polypeptide is expressed. One of skill in the art is familiar with optimal codons for expression in bacteria or yeast, including for example E. coli or Saccharomyces cerevisiae. For example, codon usage information is available from the Codon Usage Database available at kazusa.or.jp/codon (see Richmond (2000) Genome Biology, 1:241 for a description of the database). See also, Forsburg (2004) Yeast, 10:1045-1047; Brown et al. (1991) Nucleic Acids Research, 19:4298; Sharp et al. (1988) Nucleic Acids Research, 12:8207-8211; Sharp et al. (1991) Yeast, 657-78. In examples herein, nucleic acid sequences provided herein are codon optimized based on codon usage in Saccharomyces cerevisiae.
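By way of a simplified sketch, one common approach to codon optimization is to back-translate the amino acid sequence using a single preferred codon per residue for the chosen host. The Python code below assumes that approach; it is not the procedure used to prepare the sequences provided herein. The table `EXAMPLE_PREFERRED_CODONS` is an illustrative placeholder covering only a few residues, and real codon preferences would be taken from a codon usage table for the host organism.

```python
# Illustrative placeholder only: each codon encodes the listed residue under the
# standard genetic code, but the choice of codon is NOT taken from measured usage
# data. Substitute preferences from a codon usage table for the host of interest.
EXAMPLE_PREFERRED_CODONS = {
    "M": "ATG",
    "A": "GCT",
    "K": "AAA",
    "L": "TTG",
    "S": "TCT",
    "*": "TAA",  # stop
}


def back_translate(protein, preferred_codons):
    """Encode a protein using one preferred codon per amino acid."""
    missing = sorted(set(protein) - set(preferred_codons))
    if missing:
        raise KeyError(f"no preferred codon supplied for: {', '.join(missing)}")
    return "".join(preferred_codons[residue] for residue in protein)


# Hypothetical five-residue peptide followed by a stop codon.
dna = back_translate("MAKLS*", EXAMPLE_PREFERRED_CODONS)
print(dna)
```

Using one codon per residue is the simplest scheme; practical designs often also avoid unwanted restriction sites, repeats and extreme GC content, none of which is handled in this sketch.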

The modified polypeptides and encoding nucleic acid molecules provided herein can be produced by standard recombinant DNA techniques known to one of skill in the art. Any method known in the art to effect mutation of any one or more amino acids in a target protein can be employed. Methods include standard site-directed or random mutagenesis of encoding nucleic acid molecules, or solid phase polypeptide synthesis methods. For example, as described herein, nucleic acid molecules encoding a cytochrome P450 santalene oxidase polypeptide can be subjected to mutagenesis, such as random mutagenesis of the encoding nucleic acid, by error-prone PCR, site-directed mutagenesis, overlap PCR, gene shuffling, or other recombinant methods. The nucleic acid encoding the polypeptides then can be introduced into a host cell to be expressed heterologously. Hence, also provided herein are nucleic acid molecules encoding any of the modified polypeptides provided herein. In some examples, the modified cytochrome P450 santalene oxidase polypeptides are produced synthetically, such as using solid phase or solution phase peptide synthesis.

2. Cytochrome P450 Bergamotene Oxidase Polypeptides

Provided herein are cytochrome P450 bergamotene oxidase polypeptides. Also provided herein are nucleic acid molecules that encode any of the cytochrome P450 bergamotene oxidase polypeptides provided herein. The cytochrome P450 bergamotene oxidase polypeptides provided herein catalyze the formation of bergamotols from bergamotenes. Typically the cytochrome P450 bergamotene oxidase polypeptides catalyze the formation of (Z)-α-trans-bergamotol and (E)-α-trans-bergamotol from α-trans-bergamotene (e.g., the cytochrome P450 bergamotene oxidase polypeptides catalyze the hydroxylation of α-trans-bergamotene). In particular examples, the cytochrome P450 bergamotene oxidase polypeptides catalyze the formation of (E)-α-trans-bergamotol from α-trans-bergamotene. In some examples, the cytochrome P450 bergamotene oxidase polypeptides additionally catalyze the formation of minor amounts of (E)-α-santalol and (E)-β-santalol. The cytochrome P450 bergamotene oxidase polypeptides exhibit narrow substrate specificity, preferring α-santalene or β-santalene. In some examples, the cytochrome P450 bergamotene oxidase polypeptides also convert the substrate trans-nerolidol.

For example, provided herein are cytochrome P450 bergamotene oxidase polypeptides that have a sequence of amino acids set forth in any of SEQ ID NOS:6, 8, 9 and 73. Also provided herein are cytochrome P450 bergamotene oxidase polypeptides that exhibit at least 60% amino acid sequence identity to a cytochrome P450 bergamotene oxidase polypeptide having a sequence of amino acids set forth in any of SEQ ID NOS:6, 8, 9 and 73. For example, the cytochrome P450 bergamotene oxidase polypeptides provided herein can exhibit at least or about 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or more amino acid sequence identity to a cytochrome P450 bergamotene oxidase polypeptide set forth in any of SEQ ID NOS:6, 8, 9 and 73, provided the cytochrome P450 bergamotene oxidase polypeptides exhibit cytochrome P450 bergamotene oxidase activity (i.e., catalyze the formation of bergamotols from bergamotenes). Percent identity can be determined by one skilled in the art using standard alignment programs.
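For illustration, percent identity can be computed from a pair of sequences that have already been aligned by a standard alignment program. The Python sketch below assumes the two aligned strings have equal length, use "-" for gaps, and that identity is reported over the columns in which neither sequence has a gap. Alignment programs differ in the denominator they use, so this convention is an assumption of the sketch and not a definition taken from the specification; the short sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which the residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # columns containing a gap are skipped under this convention
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned fragments: 4 identical residues in 5 gap-free columns.
identity = percent_identity("MKT-LV", "MKSALV")
print(identity)
```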

Provided herein are cytochrome P450 bergamotene oxidases designated SaCYP76F38v1 (CYP76-G5), SaCYP76F37v1 (CYP76-G11), SaCYP76F38v2 (CYP76-G11) and SaCYP76F37v2 (CYP76-G14), that have a sequence of amino acids set forth in SEQ ID NOS: 6, 8, 9 and 73, respectively. Also provided herein are active fragments of cytochrome P450 bergamotene oxidase polypeptides having a sequence of amino acids set forth in any of SEQ ID NOS: 6, 8, 9 and 73. Such fragments retain one or more properties of a cytochrome P450 bergamotene oxidase polypeptide. Typically, the active fragments exhibit cytochrome P450 bergamotene oxidase activity (i.e., the ability to catalyze the hydroxylation of bergamotenes to bergamotols).

In particular examples, the cytochrome P450 bergamotene oxidases provided herein having a sequence of amino acids set forth in SEQ ID NOS: 6, 8, 9 and 73 catalyzed the formation of (E)-α-trans-bergamotol, (E)-α-santalol and (E)-β-santalol in in vitro assays with a mixture of α-santalene, α-trans-bergamotene, epi-β-santalene and β-santalene as the substrate. In such examples, (E)-α-trans-bergamotol was the major product, and (E)-α-santalol and (E)-β-santalol were minor products (see Example 11.B.3.b and FIGS. 17A-17D). In another example, in in vitro assays with either α-santalene, α-trans-bergamotene, or epi-β-santalene and β-santalene, the cytochrome P450 bergamotene oxidase provided herein having a sequence of amino acids set forth in SEQ ID NO:8 catalyzed the formation of (E)-α-santalol, (E)-α-trans-bergamotol or (E)-β-santalol, respectively (see Example 11.0 and FIGS. 20D-20F). In yet other examples, the cytochrome P450 bergamotene oxidases provided herein having a sequence of amino acids set forth in SEQ ID NOS: 6, 8, 9 and 73 catalyzed the formation of (E)-α-trans-bergamotol in in vivo assays in yeast that express santalene synthase (see Example 10.C.2 and FIGS. 14A-14D). The kinetic properties of the SaCYP76F37v1 (SaCYP76-G11) polypeptide for α- and β-santalene as substrates are described in Example 12 below.

Also provided herein are nucleic acid molecules that have a sequence of nucleotides set forth in any of SEQ ID NOS:2, 4, 5 and 67, or degenerates thereof, that encode a cytochrome P450 bergamotene oxidase polypeptide having a sequence of amino acids set forth in SEQ ID NOS:6, 8, 9 and 73, respectively. Also provided herein are nucleic acid molecules encoding a cytochrome P450 bergamotene oxidase polypeptide having at least 85% sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:2, 4, 5 and 67. For example, the nucleic acid molecules provided herein can exhibit at least or about at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or more sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:2, 4, 5 and 67, so long as the encoded cytochrome P450 bergamotene oxidase polypeptide exhibits cytochrome P450 bergamotene oxidase activity (i.e., the ability to catalyze the formation of bergamotols from bergamotene). Also provided herein are degenerate sequences of the sequence set forth in any of SEQ ID NOS:2, 4, 5 and 67 encoding a cytochrome P450 bergamotene oxidase polypeptide having a sequence of amino acids set forth in SEQ ID NO:6, 8, 9 and 73, respectively. Percent identity can be determined by one skilled in the art using standard alignment programs.

In some examples, the nucleic acid molecules that encode the cytochrome P450 bergamotene oxidase polypeptides are isolated from the sandalwood tree Santalum album. In other examples, the nucleic acid molecules and encoded cytochrome P450 bergamotene oxidase polypeptides are variants of those isolated from the sandalwood tree Santalum album.

Modified Cytochrome P450 Bergamotene Oxidase Polypeptides

Provided herein are modified cytochrome P450 bergamotene oxidase polypeptides. The modifications, which typically are amino acid insertions, deletions and/or substitutions, can be effected in any region of a cytochrome P450 bergamotene oxidase polypeptide provided the resulting modified cytochrome P450 bergamotene oxidase polypeptides at least retain cytochrome P450 bergamotene oxidase activity (i.e., the ability to catalyze the formation of a bergamotol from a bergamotene).

The modifications can be a single amino acid modification, such as single amino acid replacements (substitutions), insertions or deletions, or multiple amino acid modifications, such as multiple amino acid replacements, insertions or deletions. In some examples, entire or partial domains or regions, such as any domain or region described herein below, are exchanged with corresponding domains or regions or portions thereof from another cytochrome P450 bergamotene oxidase polypeptide. Exemplary of modifications are amino acid replacements, including single or multiple amino acid replacements. For example, modified cytochrome P450 bergamotene oxidase polypeptides provided herein can contain at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 90, 95, 100, 105, 110, 115, 120 or more modified positions compared to the cytochrome P450 polypeptide not containing the modification. For example, the modifications described herein can be in a cytochrome P450 bergamotene oxidase polypeptide having a sequence of amino acids set forth in any of SEQ ID NOS:6, 8, 9 or 73 or any variant thereof, including any that have at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to a cytochrome P450 bergamotene oxidase polypeptide set forth in any of SEQ ID NOS:6, 8, 9 and 73. Based on this description, it is within the level of one of skill in the art to generate a cytochrome P450 bergamotene oxidase polypeptide containing any one or more of the described mutations, and test each for cytochrome P450 bergamotene oxidase activity described herein.

Also, in some examples, provided herein are modified active fragments of cytochrome P450 bergamotene oxidase polypeptides that contain any of the modifications provided herein. Such fragments retain one or more properties of a cytochrome P450 bergamotene oxidase. Typically, the modified cytochrome P450 bergamotene oxidase polypeptides exhibit bergamotene oxidase activity (i.e., the ability to hydroxylate bergamotene).

Modifications in a cytochrome P450 bergamotene oxidase also can be made to a cytochrome P450 bergamotene oxidase polypeptide that also contains other modifications, including modifications of the primary sequence and modifications not in the primary sequence of the polypeptide. For example, modifications described herein can be in a cytochrome P450 bergamotene oxidase polypeptide that is a fusion polypeptide or chimeric polypeptide, including hybrids of different cytochrome P450 bergamotene oxidase polypeptides with different cytochrome P450 polypeptides (e.g., containing one or more domains or regions from another cytochrome P450) and also synthetic cytochrome P450 bergamotene oxidase polypeptides prepared recombinantly or synthesized or constructed by other methods known in the art based upon the sequence of known polypeptides.

In some examples, the modifications are amino acid replacements. In further examples, the modified cytochrome P450 bergamotene oxidase polypeptides provided herein contain one or more modifications in a domain. As described elsewhere herein, the modifications in a domain or structural domain can be by replacement of corresponding heterologous residues from another cytochrome P450 polypeptide.

To retain cytochrome P450 bergamotene oxidase activity, modifications typically are not made at those positions necessary for cytochrome P450 activity, i.e., in the catalytic center or in conserved residues. For example, generally modifications are not made at a position corresponding to Glu367, Arg370, Gly445, Arg446, Arg447, Ile448, Cys449, Pro450 or Gly451 with reference to a sequence of amino acids set forth in SEQ ID NO:6, 8, 9 or 73.

The modified cytochrome P450 bergamotene oxidase polypeptides can contain two or more modifications, including amino acid replacements or substitutions, insertions or deletions, truncations or combinations thereof. Generally, multiple modifications provided herein can be combined by one of skill in the art so long as the modified cytochrome P450 bergamotene oxidase polypeptide retains cytochrome P450 bergamotene oxidase activity.

Also provided herein are nucleic acid molecules that encode any of the modified cytochrome P450 bergamotene oxidase polypeptides provided herein. In particular examples, the nucleic acid sequence can be codon optimized, for example, to increase expression levels of the encoded sequence. The particular codon usage is dependent on the host organism in which the modified polypeptide is expressed. One of skill in the art is familiar with optimal codons for expression in bacteria or yeast, including for example E. coli or Saccharomyces cerevisiae. For example, codon usage information is available from the Codon Usage Database available at kazusa.or.jp/codon (see Richmond (2000) Genome Biology, 1:241 for a description of the database). See also, Forsburg (2004) Yeast, 10:1045-1047; Brown et al. (1991) Nucleic Acids Research, 19:4298; Sharp et al. (1988) Nucleic Acids Research, 12:8207-8211; Sharp et al. (1991) Yeast, 657-78. In examples herein, nucleic acid sequences provided herein are codon optimized based on codon usage in Saccharomyces cerevisiae.

The modified polypeptides and encoding nucleic acid molecules provided herein can be produced by standard recombinant DNA techniques known to one of skill in the art. Any method known in the art to effect mutation of any one or more amino acids in a target protein can be employed. Methods include standard site-directed or random mutagenesis of encoding nucleic acid molecules, or solid phase polypeptide synthesis methods. For example, as described herein, nucleic acid molecules encoding a cytochrome P450 bergamotene oxidase polypeptide can be subjected to mutagenesis, such as random mutagenesis of the encoding nucleic acid, by error-prone PCR, site-directed mutagenesis, overlap PCR, gene shuffling, or other recombinant methods. The nucleic acid encoding the polypeptides then can be introduced into a host cell to be expressed heterologously. Hence, also provided herein are nucleic acid molecules encoding any of the modified polypeptides provided herein. In some examples, the modified cytochrome P450 bergamotene oxidase polypeptides are produced synthetically, such as using solid phase or solution phase peptide synthesis.

3. Additional Modifications

Provided herein are cytochrome P450 polypeptides, including cytochrome P450 santalene oxidase and cytochrome P450 bergamotene oxidase polypeptides, that contain additional modifications. For example, modified cytochrome P450 polypeptides include truncated cytochrome P450 polypeptides, cytochrome P450 polypeptides having altered activities or properties, chimeric cytochrome P450 polypeptides, cytochrome P450 polypeptides containing domain swaps, cytochrome P450 fusion proteins, or cytochrome P450 polypeptides having any modification described elsewhere herein.

a. Truncated Polypeptides

Also provided herein are truncated cytochrome P450 polypeptides. The truncated cytochrome P450 polypeptides can be truncated at the N-terminus or C-terminus, so long as the truncated cytochrome P450 polypeptides retain the catalytic activity of a cytochrome P450, such as cytochrome P450 santalene oxidase or cytochrome P450 bergamotene oxidase activity. Typically, the truncated cytochrome P450 santalene oxidase polypeptides exhibit santalene oxidase activity (i.e., the ability to catalyze the hydroxylation of a santalene, namely α-santalene, β-santalene or epi-β-santalene). Typically, the truncated cytochrome P450 bergamotene oxidase polypeptides exhibit bergamotene oxidase activity (i.e., the ability to catalyze the hydroxylation of a bergamotene). In some examples, the cytochrome P450 polypeptides, including the cytochrome P450 santalene oxidase and cytochrome P450 bergamotene oxidase polypeptides, are truncated at the C-terminus. In other examples, the cytochrome P450 polypeptides, including the cytochrome P450 santalene oxidase and cytochrome P450 bergamotene oxidase polypeptides, are truncated at the N-terminus.

In some examples, the cytochrome P450 polypeptides, including the cytochrome P450 santalene oxidase and cytochrome P450 bergamotene oxidase polypeptides, are truncated at the N-terminus, C-terminus or both termini of a cytochrome P450 polypeptide provided herein, such as truncation of a sequence of amino acids set forth in any of SEQ ID NOS:6-9. In other examples, any of the modified cytochrome P450 polypeptides provided herein are truncated. The modified cytochrome P450 polypeptides can be truncated at their N-terminus, C-terminus, or both termini. For example, any cytochrome P450 polypeptide provided herein can be truncated by at or about or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75 or more amino acid residues at the N-terminus, provided the cytochrome P450 polypeptide retains cytochrome P450 activity. In other examples, any cytochrome P450 polypeptide provided herein can be truncated by at or about or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75 or more amino acid residues at the C-terminus, provided the cytochrome P450 polypeptide retains cytochrome P450 activity.
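At the sequence level, a truncation of the kind described above amounts to removing a chosen number of residues from one or both termini. The Python sketch below is a minimal illustration; the function name `truncate` and the ten-residue sequence are hypothetical, and whether a given truncated polypeptide retains cytochrome P450 activity has to be determined experimentally.

```python
def truncate(sequence, n_term=0, c_term=0):
    """Remove n_term residues from the N-terminus and c_term from the C-terminus."""
    if n_term < 0 or c_term < 0:
        raise ValueError("truncation lengths must be non-negative")
    if n_term + c_term >= len(sequence):
        raise ValueError("truncation would remove the entire sequence")
    return sequence[n_term:len(sequence) - c_term]


# Hypothetical ten-residue sequence: drop two residues at the N-terminus
# and three residues at the C-terminus.
shortened = truncate("MKTAYIAKQR", n_term=2, c_term=3)
print(shortened)
```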

b. Polypeptides with Altered Activities or Properties

The modified cytochrome P450 polypeptides provided herein can also exhibit changes in activities and/or properties. The modified cytochrome P450 polypeptides can exhibit, for example, improved properties, such as increased catalytic activity, increased selectivity, increased substrate specificity, increased substrate binding, increased stability, and/or increased expression in a host cell, and altered properties, such as altered product distribution and altered substrate specificity. Such improved or altered activities can result in increased production of santalols and/or bergamotols.

In some examples, the modified cytochrome P450 polypeptides have altered substrate specificity. For example, the substrate specificity of a modified cytochrome P450 polypeptide can be altered by at least or at least about 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100% or more compared to the unmodified cytochrome P450 polypeptide. For example, a modified cytochrome P450 santalene oxidase or cytochrome P450 bergamotene oxidase polypeptide can catalyze the monooxygenation of a terpene substrate that is not a santalene or bergamotene. In such examples, the modified cytochrome P450 polypeptides catalyze the formation of terpenoids other than santalols or bergamotols from any suitable terpene substrate. For example, the modified cytochrome P450 polypeptides can produce one or more different monoterpenoids, sesquiterpenoids or diterpenoids other than santalols and bergamotols.

In some examples, the modified cytochrome P450 polypeptides have an altered terpenoid product distribution. In some examples, altered product distribution results in an increased amount of a desired terpenoid product, and thus product distribution is improved compared to the product distribution of the unmodified cytochrome P450. In other examples, altered product distribution results in a decreased amount of a desired terpenoid product, and thus the product distribution of the modified cytochrome P450 is decreased compared to that of the unmodified cytochrome P450. In one example, the modified cytochrome P450 santalene oxidase produces a different ratio of terpenoid products compared to the unmodified cytochrome P450 santalene oxidase. For example, the amount of a terpenoid produced by the modified cytochrome P450 can be increased or decreased by at least or at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80% or more compared to the amount of a different terpenoid produced by the unmodified cytochrome P450. For example, the amount of a terpenoid produced by the modified cytochrome P450 santalene oxidase, such as, for example, a β-santalol, can be increased by at least or at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80% or more compared to the amount of a different terpenoid produced by the unmodified cytochrome P450 santalene oxidase, such as, for example, an α-santalol. In some examples, the modified cytochrome P450 santalene oxidases produce more β-santalol than any other terpenoid compound. In another example, the modified cytochrome P450 bergamotene oxidase produces a different ratio of terpenoid products compared to the unmodified cytochrome P450 bergamotene oxidase. For example, the amount of a terpenoid produced by the modified cytochrome P450 bergamotene oxidase, such as, for example, an α-trans-bergamotol, can be increased by at least or at least about 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80% or more compared to the amount of a different terpenoid produced by the unmodified cytochrome P450 bergamotene oxidase.

In some examples, the modified cytochrome P450 polypeptide exhibits a similar, increased and/or improved activity compared to the unmodified cytochrome P450 polypeptide. For example, a modified cytochrome P450 polypeptide exhibits increased terpenoid production compared to an unmodified cytochrome P450 polypeptide. The increased terpenoid production can be an increase in the total amount of terpenoids produced by the modified cytochrome P450 polypeptide or can be an increase in the amount of a particular terpenoid produced by the modified cytochrome P450 polypeptide. For example, the total terpenoid production of a modified cytochrome P450 polypeptide can be increased by at least or at least about 1%, 3%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100% or more compared to an unmodified cytochrome P450 polypeptide. In some examples, the total terpenoid production of a modified cytochrome P450 polypeptide is at least or about 1.2-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, 15-fold, 16-fold, 17-fold, 18-fold, 19-fold, 20-fold or more compared to an unmodified cytochrome P450 polypeptide. In another example, the production of a particular terpenoid by a modified cytochrome P450 polypeptide is increased by at least or at least about 1%, 3%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100% or more compared to an unmodified cytochrome P450 polypeptide. In some examples, a modified cytochrome P450 polypeptide produces at least or about 1.2-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, 15-fold, 16-fold, 17-fold, 18-fold, 19-fold, 20-fold or more of a particular terpenoid product compared to the unmodified cytochrome P450 polypeptide.

In some examples, the modified cytochrome P450 polypeptide exhibits improved substrate specificity compared to the unmodified cytochrome P450 polypeptide. Substrate specificity of the modified cytochrome P450 polypeptide can be increased by at least or at least about 1%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100% or more compared to the substrate specificity of the unmodified cytochrome P450 polypeptide. For example, the modified cytochrome P450 polypeptide can exhibit increased substrate specificity for a terpene, such as a santalene, compared to a different terpene, such as a bergamotene. In such examples, increased specificity for a santalene results in increased production of santalols and decreased production of a bergamotol.

In some examples, the modified cytochrome P450 polypeptide, such as a modified cytochrome P450 santalene oxidase polypeptide, exhibits similar or increased or improved santalene oxidase activity compared to the unmodified cytochrome P450 santalene oxidase polypeptide. For example, the modified cytochrome P450 santalene oxidase polypeptide can exhibit increased specificity or selectivity for oxidation of α-santalene, β-santalene and/or epi-β-santalene compared to the unmodified cytochrome P450 santalene oxidase polypeptide. In some instances of such examples, the modified cytochrome P450 santalene oxidase selectively monooxygenates β-santalene compared to the unmodified cytochrome P450 santalene oxidase. In other examples, the modified cytochrome P450 santalene oxidase polypeptide exhibits reduced selectivity for oxidation of bergamotene compared to the unmodified cytochrome P450 santalene oxidase. For example, the modified cytochrome P450 santalene oxidase exhibits a decrease in activity towards oxidation of bergamotene of at least or at least about 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100% or more compared to the unmodified cytochrome P450 santalene oxidase.

In some examples, the modified cytochrome P450 polypeptide, such as a modified cytochrome P450 bergamotene oxidase polypeptide, exhibits similar or increased or improved bergamotene oxidase activity compared to the unmodified cytochrome P450 bergamotene oxidase polypeptide. For example, the modified cytochrome P450 bergamotene oxidase polypeptide can exhibit increased specificity or selectivity for oxidation of α-trans-bergamotene compared to the unmodified cytochrome P450 bergamotene oxidase.

c. Domain Swaps

Provided herein are modified cytochrome P450 polypeptides that are chimeric polypeptides containing a swap (deletion and insertion) by deletion of amino acid residues of one or more domains or regions therein or portions thereof and insertion of a heterologous sequence of amino acids. In some examples, the heterologous sequence is a randomized sequence of amino acids. In other examples, the heterologous sequence is a contiguous sequence of amino acids for the corresponding domain or region or portion thereof from another cytochrome P450. The heterologous sequence that is replaced or inserted generally includes at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, or more amino acids. In examples where the heterologous sequence is from a corresponding domain or a portion thereof of another cytochrome P450, the heterologous sequence generally includes at least 50%, 60%, 70%, 80%, 90%, 95% or more contiguous amino acids of the corresponding domain or region or portion. In such an example, adjacent residues to the heterologous corresponding domain or region or portion thereof also can be included in a modified cytochrome P450 polypeptide provided herein.

In one example of swap mutants provided herein, at least one domain or region or portion thereof of a cytochrome P450 polypeptide is replaced with a contiguous sequence of amino acids for the corresponding domain or region or portions thereof from another cytochrome P450 polypeptide. In some examples, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more domains or regions or portions thereof are replaced with a contiguous sequence of amino acids for the corresponding domain or region or portions thereof from another cytochrome P450 polypeptide.

Any domain or region or portion thereof of a cytochrome P450 polypeptide can be replaced with a heterologous sequence of amino acids, such as heterologous sequence from the corresponding domain or region from another cytochrome P450. A domain or region can be a structural domain or a functional domain. One of skill in the art is familiar with domains or regions in cytochrome P450s. Functional domains include, for example, the catalytic domain or a portion thereof. A structural domain can include all or a portion of helix A, β strand 1-1, β strand 1-2, helix B, β strand 1-5, helix B′, helix C, helix C′, helix D, β strand 3-1, helix E, helix F, helix G, helix H, β strand 5-1, β strand 5-2, helix I, helix J, helix J′, helix K, β strand 1-4, β strand 2-1, β strand 2-2, β strand 1-3, helix K′, helix K″, Heme domain, helix L, β strand 3-3, β strand 4-1, β strand 4-2 and β strand 3-2. One of skill in the art is familiar with various cytochrome P450s and can identify corresponding domains or regions or portions of amino acids thereof.
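As a sequence-level illustration of a swap, the residues of a region in one polypeptide can be deleted and replaced with the residues of the corresponding region in another. The Python sketch below assumes the region boundaries have already been identified, for example from an alignment, and are given as 1-based inclusive residue ranges. The function name, the placeholder sequences and the ranges are hypothetical and do not correspond to any domain boundaries in the specification.

```python
def swap_region(host, donor, host_range, donor_range):
    """Replace host residues in host_range with donor residues in donor_range.

    Ranges are (start, end) pairs using 1-based, inclusive residue numbering.
    The two ranges need not have the same length.
    """
    host_start, host_end = host_range
    donor_start, donor_end = donor_range
    if not 1 <= host_start <= host_end <= len(host):
        raise ValueError("host_range lies outside the host sequence")
    if not 1 <= donor_start <= donor_end <= len(donor):
        raise ValueError("donor_range lies outside the donor sequence")
    heterologous = donor[donor_start - 1:donor_end]
    # Deletion of the host region and insertion of the heterologous residues.
    return host[:host_start - 1] + heterologous + host[host_end:]


# Placeholder sequences: host residues 3-5 are replaced with donor residues 2-6.
chimera = swap_region("AAAAAAAAAA", "CCCCCCCCCC", (3, 5), (2, 6))
print(chimera)
```

Because the donor region can differ in length from the region it replaces, residue numbering downstream of the swap can shift relative to the unmodified polypeptide.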

Typically, the resulting modified cytochrome P450 polypeptides exhibit cytochrome P450 monooxygenase activity and the ability to produce santalols and/or bergamotols from santalenes and bergamotenes. For example, the modified cytochrome P450 santalene oxidase polypeptides exhibit 50% to 5000%, such as 50% to 120%, 100% to 500% or 110% to 250% of the santalol production from santalene compared to the cytochrome P450 santalene oxidase not containing the modification (e.g., the amino acid replacement or swap of amino acid residues of a domain or region) and/or compared to wild type cytochrome P450 santalene oxidase set forth in SEQ ID NO:7, 74, 75, 76 or 77. Typically, the modified cytochrome P450 santalene oxidase polypeptides exhibit increased santalol production from santalene compared to the cytochrome P450 santalene oxidase not containing the modification, such as compared to the cytochrome P450 santalene oxidase set forth in SEQ ID NO:7, 74, 75, 76 or 77. For example, the modified cytochrome P450 santalene oxidase polypeptides can produce santalols from santalenes in an amount that is at least or about 101%, 102%, 103%, 104%, 105%, 106%, 107%, 108%, 109%, 110%, 115%, 120%, 125%, 130%, 135%, 140%, 145%, 150%, 160%, 170%, 180%, 200%, 250%, 300%, 350%, 400%, 500%, 1500%, 2000%, 3000%, 4000% or 5000% of the amount of santalols produced from santalenes by wild type cytochrome P450 santalene oxidase not containing the modification under the same conditions. For example, the santalol production is increased at least 1.2-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, 15-fold, 16-fold, 17-fold, 18-fold, 19-fold, 20-fold or more.

In another example, the modified cytochrome P450 bergamotene oxidase polypeptides exhibit 50% to 5000%, such as 50% to 120%, 100% to 500% or 110% to 250% of the bergamotol production from bergamotene compared to the cytochrome P450 bergamotene oxidase not containing the modification (e.g. the amino acid replacement or swap of amino acid residues of a domain or region) and/or compared to wild type cytochrome P450 bergamotene oxidase set forth in SEQ ID NO:6, 8, 9 or 73. Typically, the modified cytochrome P450 bergamotene oxidase polypeptides exhibit increased bergamotol production from bergamotene compared to the cytochrome P450 bergamotene oxidase not containing the modification, such as compared to the cytochrome P450 bergamotene oxidase set forth in SEQ ID NO:6, 8, 9 or 73. For example, the modified cytochrome P450 bergamotene oxidase polypeptides can produce bergamotol from bergamotene in an amount that is at least or about 101%, 102%, 103%, 104%, 105%, 106%, 107%, 108%, 109%, 110%, 115%, 120%, 125%, 130%, 135%, 140%, 145%, 150%, 160%, 170%, 180%, 200%, 250%, 300%, 350%, 400%, 500%, 1500%, 2000%, 3000%, 4000% or 5000% of the amount of bergamotol produced from bergamotene by wild type cytochrome P450 bergamotene oxidase not containing the modification under the same conditions. For example, the bergamotol production is increased at least 1.2-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold, 6-fold, 7-fold, 8-fold, 9-fold, 10-fold, 11-fold, 12-fold, 13-fold, 14-fold, 15-fold, 16-fold, 17-fold, 18-fold, 19-fold, 20-fold or more.
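
The percentage and fold-change figures above describe the same comparison in two ways. As a minimal sketch, assuming product amounts for the modified and unmodified enzymes have been measured under the same conditions and in the same units, the conversion can be written as follows (the function name and example values are hypothetical, chosen only for illustration):

```python
def relative_production(modified_amount, reference_amount):
    """Return (percent_of_reference, fold_change) for two product amounts.

    Both amounts are assumed to be measured under the same conditions and
    in the same units (for example, mg of santalol per litre of culture).
    """
    if reference_amount <= 0:
        raise ValueError("reference amount must be positive")
    fold = modified_amount / reference_amount
    return fold * 100.0, fold


# Hypothetical amounts: 30 units from the modified enzyme, 20 from the reference.
percent, fold = relative_production(30.0, 20.0)
print(f"{percent:.0f}% of reference ({fold:.1f}-fold)")
```

Under this convention, 150% of the reference amount corresponds to a 1.5-fold increase.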

In particular examples herein, modified cytochrome P450 polypeptides provided herein are swap mutants whereby all or a portion of one or more structural domains is replaced with a corresponding structural domain of another cytochrome P450 polypeptide. Table 3 below identifies structural domains of cytochrome P450 santalene oxidase (SEQ ID NO:7) and cytochrome P450 bergamotene oxidase (SEQ ID NO:6) based on alignment of the cytochrome P450 polypeptides with cytochrome P450 BM-3, a class II microsomal P450 (SEQ ID NO:66; Accession No. 2HPD; Ravichandran et al. (1993) Science 261:731-736; see also FIGS. 5A-5B). Hence, the corresponding domain can be identified in other cytochrome P450 polypeptides.

TABLE 3 Structural Domains

structure        santalene oxidase    bergamotene oxidase
                 (SEQ ID NO: 7)       (SEQ ID NO: 6)
helix A          54-65                54-65
β strand 1-1     67-74                67-74
β strand 1-2     75-82                75-82
helix B          83-91                83-91
β strand 1-5     95-98                95-98
helix B′         101-108              101-108
helix C          124-133              124-133
helix D          149-164              149-164
β strand 3-1     170-173              170-173
helix E          174-189              174-189
helix F          204-218              204-218
helix G          238-265              238-265
helix H          278-285              278-285
β strand 5-1     287-290              287-290
β strand 5-2     291-294              291-294
helix I          297-329              297-329
helix J          330-343              330-343
helix J′         351-358              351-358
helix K          359-371              359-371
β strand 1-4     376-382              376-382
β strand 2-1     383-389              383-389
β strand 2-2     391-397              391-397
β strand 1-3     398-402              398-402
helix K′         403-410              403-410
Heme domain      444-451              444-451
helix L          452-469              452-469
β strand 3-3     470-474              470-474
β strand 4-1     481-485              481-485
β strand 4-2     487-491              487-491
β strand 3-2     493-500              493-500
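
At the sequence level, a swap mutant of the kind described above can be sketched as a simple splice. The following is a minimal illustration, assuming the recipient and donor polypeptides use the same residue numbering for the domain, as Table 3 lists for SEQ ID NO:7 and SEQ ID NO:6. The dictionary holds only three of the Table 3 entries, the function name is hypothetical, and the sequences are placeholders rather than the sequences of the Sequence Listing.

```python
# A few structural domains from Table 3, as 1-based inclusive residue ranges.
STRUCTURAL_DOMAINS = {
    "helix A": (54, 65),
    "helix I": (297, 329),
    "Heme domain": (444, 451),
}


def swap_domain(recipient, donor, domain, domains=STRUCTURAL_DOMAINS):
    """Replace one domain of `recipient` with the same residue range of `donor`.

    Assumes both polypeptides share the residue numbering of the domain.
    """
    start, end = domains[domain]
    if len(recipient) < end or len(donor) < end:
        raise ValueError("sequence is shorter than the domain end position")
    # Convert the 1-based inclusive range to Python slice indices.
    return recipient[:start - 1] + donor[start - 1:end] + recipient[end:]


# Placeholder sequences (not real P450 sequences), 500 residues each.
recipient = "A" * 500
donor = "G" * 500
chimera = swap_domain(recipient, donor, "helix A")
print(chimera[50:68])
```

Where the two polypeptides are numbered differently, the donor range would instead have to be taken from an alignment of the two sequences.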

Any methods known in the art for generating chimeric polypeptides can be used to replace all or a contiguous portion of a domain of a first cytochrome P450 with all or a contiguous portion of the corresponding domain of a second cytochrome P450 (see, U.S. Pat. Nos. 5,824,774, 6,072,045, 7,186,891 and 8,106,260, and U.S. Pat. Pub. No. 20110081703). Also, gene shuffling methods can be employed to generate chimeric polypeptides and/or polypeptides with domain or region swaps.

For example, corresponding domains or regions of any two cytochrome P450s can be exchanged using any suitable recombinant method known in the art, or by in vitro synthesis. Exemplary of recombinant methods is a two stage overlapping PCR method, such as described herein. In such methods, primers that introduce mutations at a plurality of codon positions in the nucleic acids encoding the targeted domain or portion thereof in the first cytochrome P450 can be employed. The mutations together form the heterologous region (i.e. the corresponding region from the second cytochrome P450). Alternatively, for example, randomized amino acids can be used to replace particular domains or regions. It is understood that primer errors, PCR errors and/or other errors in the cloning or recombinant methods can result in errors such that the resulting swapped or replaced region or domain does not exhibit an amino acid sequence that is identical to the corresponding region from the second cytochrome P450.

In an exemplary PCR-based method, the first stage PCR uses (i) a downstream primer that anneals downstream of the region that is being replaced, together with a mutagenic primer that includes approximately fifteen nucleotides (or an effective number to effect annealing, such as 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 20, 25 nucleotides or more) of homologous sequence on each side of the domain or region to be exchanged or randomized, flanking the region to be imported into the target gene, and (ii) an upstream primer that anneals upstream of the region that is being replaced, together with an opposite strand mutagenic primer that also includes approximately fifteen nucleotides (or an effective number to effect annealing, such as 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 20, 25 nucleotides or more) of homologous sequence on each side of the domain or region to be exchanged or randomized, flanking the region to be imported into the target gene. If a replacement in which a domain or region of a first cytochrome P450 gene is replaced with the corresponding domain or region from a second cytochrome P450 is being performed, nucleotides in the mutagenic primers between the flanking regions from the first cytochrome P450 contain codons for the corresponding region of the second cytochrome P450. In instances where the amino acids in a domain or region are to be randomized, nucleotides of the mutagenic primers between the flanking regions from the first cytochrome P450 contain random nucleotides. An overlapping PCR is then performed to join the two fragments, using the upstream and downstream primers. The resulting PCR product then can be cloned into any suitable vector for expression of the modified cytochrome P450.
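
The layout of the mutagenic primers described above (homologous flank, imported codons, homologous flank) can be sketched in a few lines. This is an illustrative sketch only, assuming 0-based, end-exclusive nucleotide coordinates for the region being replaced and a default flank of fifteen nucleotides. The function names and the toy gene are hypothetical, and the sketch does not address melting temperature, secondary structure or other practical primer design considerations.

```python
# Complement table for unambiguous DNA bases only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mutagenic_primers(target_gene, region_start, region_end, replacement, flank=15):
    """Build a mutagenic primer and its opposite strand partner.

    `region_start` and `region_end` are 0-based, end-exclusive nucleotide
    coordinates of the region of the first gene that is being replaced.
    `replacement` holds the codons imported from the second gene.
    """
    if region_start < flank or region_end + flank > len(target_gene):
        raise ValueError("not enough sequence on one side for the flank")
    left = target_gene[region_start - flank:region_start]
    right = target_gene[region_end:region_end + flank]
    forward = left + replacement + right
    return forward, reverse_complement(forward)


# Toy gene: a 6-nucleotide region (CCCCCC) to be replaced by two new codons.
toy_gene = "A" * 20 + "CCCCCC" + "T" * 20
fwd, rev = mutagenic_primers(toy_gene, 20, 26, "GGGAAA")
print(fwd)
print(rev)
```

For randomization, the replacement would carry degenerate positions, which would require extending the complement table beyond the four unambiguous bases handled here.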

Further, any of the modified cytochrome P450 polypeptides containing swap mutations herein can contain one or more further amino acid replacements as described herein above.

d. Additional Variants

Cytochrome P450 polypeptides provided herein can be modified by any method known to one of skill in the art for generating protein variants, including, but not limited to, DNA or gene shuffling, error prone PCR, overlap PCR or other recombinant methods. In one example, nucleic acid molecules encoding any cytochrome P450 polypeptide or variant cytochrome P450 polypeptide provided herein can be modified by gene shuffling. Gene shuffling involves one or more cycles of random fragmentation and reassembly of at least two nucleotide sequences, followed by screening to select nucleotide sequences encoding polypeptides with desired properties. The recombination can be performed in vitro (see Stemmer et al. (1994) Proc Natl Acad Sci USA 91:10747-10751; Stemmer et al. (1994) Nature 370:389-391; Crameri et al. (1998) Nature 391:288-291; U.S. Pat. Nos. 5,605,793, 5,811,238, 5,830,721, 5,834,252 and 5,837,458) or in vivo (see, International Pat. Pub. No. WO199707205). The nucleic acid molecules encoding the polypeptides then can be introduced into a host cell to be expressed heterologously and tested for their cytochrome P450 activity by any method described in section G below.

e. Fusion or Chimeric Proteins

Nucleic acid molecules provided herein include fusion or chimeric nucleic acid molecules that encode a santalene synthase and a cytochrome P450 polypeptide. For example, provided herein are nucleic acid molecules encoding a fusion polypeptide that contains any santalene synthase and cytochrome P450 santalene oxidase polypeptide provided herein and that is capable of catalyzing the formation of a santalol, such as an α-santalol, β-santalol or epi-β-santalol, from FPP. For example, provided herein are nucleic acid molecules encoding a fusion polypeptide that contains a santalene synthase set forth in any of SEQ ID NOS:17, 52 or 53 and a cytochrome P450 santalene oxidase polypeptide set forth in SEQ ID NO:7, 74, 75, 76 or 77. Also provided herein are fusion polypeptides containing a santalene synthase set forth in any of SEQ ID NOS:17, 52 or 53 and a cytochrome P450 santalene oxidase polypeptide set forth in SEQ ID NO:7, 74, 75, 76 or 77. Also provided herein are nucleic acid molecules encoding a fusion polypeptide that contains any santalene synthase and cytochrome P450 bergamotene oxidase polypeptide provided herein and that is capable of catalyzing the formation of a bergamotol, such as an α-trans-bergamotol, from FPP. For example, provided herein are nucleic acid molecules encoding a fusion polypeptide that contains a santalene synthase set forth in any of SEQ ID NOS:17, 52 or 53 and a cytochrome P450 bergamotene oxidase polypeptide set forth in any of SEQ ID NOS:6, 8, 9 or 73. Also provided herein are fusion polypeptides containing a santalene synthase set forth in any of SEQ ID NOS:17, 52 or 53 and a cytochrome P450 bergamotene oxidase polypeptide set forth in any of SEQ ID NOS:6, 8, 9 or 73. The fusion polypeptides can be linked directly or via a linker.

Nucleic acid molecules provided herein include fusion or chimeric nucleic acid molecules that encode a cytochrome P450 polypeptide and a cytochrome P450 reductase. For example, provided herein are nucleic acid molecules encoding a fusion polypeptide that contains a cytochrome P450 santalene oxidase polypeptide set forth in any of SEQ ID NOS:7, 74, 75, 76 or 77 and a cytochrome P450 reductase set forth in any of SEQ ID NOS:12-15. Also provided herein are fusion polypeptides containing a cytochrome P450 santalene oxidase polypeptide set forth in any of SEQ ID NOS:7, 74, 75, 76 or 77 and a cytochrome P450 reductase set forth in any of SEQ ID NOS:12-15. In another example, provided herein are nucleic acid molecules encoding a fusion polypeptide that contains a cytochrome P450 bergamotene oxidase polypeptide set forth in any of SEQ ID NOS:6, 8, 9 or 73 and a cytochrome P450 reductase set forth in any of SEQ ID NOS:12-15. Also provided herein are fusion polypeptides containing a cytochrome P450 bergamotene oxidase polypeptide set forth in any of SEQ ID NOS:6, 8, 9 or 73 and a cytochrome P450 reductase set forth in any of SEQ ID NOS:12-15. The fusion polypeptides can be linked directly or via a linker.

Nucleic acid molecules provided herein include fusion or chimeric nucleic acid molecules that encode a santalene synthase, a cytochrome P450 polypeptide and a cytochrome P450 reductase. For example, provided herein are nucleic acid molecules encoding a fusion polypeptide that contains a santalene synthase set forth in any of SEQ ID NOS:17, 52 or 53, a cytochrome P450 santalene oxidase polypeptide set forth in any of SEQ ID NOS:7, 74, 75, 76 or 77 and a cytochrome P450 reductase set forth in any of SEQ ID NOS:12-15. Also provided herein are fusion polypeptides containing a santalene synthase set forth in any of SEQ ID NOS:17, 52 or 53, a cytochrome P450 santalene oxidase polypeptide set forth in any of SEQ ID NOS:7, 74, 75, 76 or 77 and a cytochrome P450 reductase set forth in any of SEQ ID NOS:12-15. In another example, provided herein are nucleic acid molecules encoding a fusion polypeptide that contains a santalene synthase set forth in any of SEQ ID NOS:17, 52 or 53, a cytochrome P450 bergamotene oxidase polypeptide set forth in any of SEQ ID NOS:6, 8, 9 or 73 and a cytochrome P450 reductase set forth in any of SEQ ID NOS:12-15. Also provided herein are fusion polypeptides containing a santalene synthase set forth in any of SEQ ID NOS:17, 52 or 53, a cytochrome P450 bergamotene oxidase polypeptide set forth in any of SEQ ID NOS:6, 8, 9 or 73 and a cytochrome P450 reductase set forth in any of SEQ ID NOS:12-15. The fusion polypeptides can be linked directly or via a linker.

In another example, provided herein is a nucleic acid molecule that encodes a santalene synthase, a cytochrome P450 and/or a cytochrome P450 reductase, such that, when expressed in a host cell, such as a bacterial or yeast host cell, a santalene synthase, a cytochrome P450 and/or a cytochrome P450 reductase are expressed. In one example, provided herein is a nucleic acid molecule that encodes a santalene synthase and a cytochrome P450 santalene oxidase. In another example, provided herein is a nucleic acid molecule that encodes a santalene synthase and a cytochrome P450 bergamotene oxidase. In yet another example, provided herein is a nucleic acid molecule that encodes a santalene synthase, a cytochrome P450 santalene oxidase and a cytochrome P450 reductase. In another example, provided herein is a nucleic acid molecule that encodes a santalene synthase, a cytochrome P450 bergamotene oxidase and a cytochrome P450 reductase. Further, when the host cell is capable of producing FPP, the encoded polypeptides catalyze the production of santalols and/or bergamotols.

Other examples of fusion proteins include, but are not limited to, fusions of a signal sequence, a tag such as for localization, e.g. a his₆ tag or a myc tag, or a tag for purification, for example, a GST fusion, GFP fusion or CBP fusion, and a sequence for directing protein secretion and/or membrane association.

D. CYTOCHROME P450 REDUCTASE POLYPEPTIDES AND ENCODING NUCLEIC ACID MOLECULES

Provided herein are cytochrome P450 reductase polypeptides. Also provided herein are nucleic acid molecules that encode any of the cytochrome P450 reductase polypeptides provided herein. The cytochrome P450 reductase polypeptides provided herein transfer two electrons from NADPH to a cytochrome P450. In some examples, the nucleic acid molecules that encode the cytochrome P450 reductase polypeptides are the same as those isolated from the sandalwood tree Santalum album. In other examples, the nucleic acid molecules and encoded cytochrome P450 reductase polypeptides are variants of those isolated from the sandalwood tree Santalum album.

Also provided herein are modified cytochrome P450 reductase polypeptides and nucleic acid molecules that encode any of the modified cytochrome P450 reductase polypeptides provided herein. The modifications can be made in any region of a cytochrome P450 reductase polypeptide provided the cytochrome P450 reductase polypeptide at least retains the CPR catalytic activity of the unmodified cytochrome P450 reductase polypeptide. For example, modifications can be made to a cytochrome P450 reductase polypeptide provided that the cytochrome P450 reductase polypeptide retains CPR activity (i.e., the ability to transfer two electrons from NADPH to a cytochrome P450).

The modifications can include codon optimization of the nucleic acids and/or changes that result in a single amino acid modification in the encoded polypeptide, such as single amino acid replacement (substitutions), insertions or deletions, or multiple amino acid modifications, such as multiple amino acid replacements, insertions or deletions, including swaps of domains or regions of the polypeptide. In some examples, entire or partial domains or regions, such as any domain or region described herein, are exchanged with corresponding domains or regions or portions thereof from another cytochrome P450 reductase polypeptide. Exemplary of modifications are amino acid replacements, including single or multiple amino acid replacements. For example, modified cytochrome P450 reductase polypeptides provided herein can contain at least or 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 90, 95, 100, 105, 110, 115, 120 or more modified positions compared to the cytochrome P450 reductase polypeptide not containing the modification.

Provided herein are cytochrome P450 reductase polypeptides having a sequence of amino acids set forth in SEQ ID NO:12 or 13. Also provided herein are cytochrome P450 reductase polypeptides that exhibit at least 60% amino acid sequence identity to a cytochrome P450 reductase polypeptide set forth in SEQ ID NO:12 or 13. For example, the cytochrome P450 reductase polypeptides provided herein can exhibit at least or at least about 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to a cytochrome P450 reductase polypeptide set forth in SEQ ID NO:12 or 13, provided that the resulting cytochrome P450 reductase polypeptide at least retains CPR activity (i.e., the ability to transfer two electrons from NADPH to a cytochrome P450). Percent identity can be determined by one skilled in the art using standard alignment programs.

Also, in some examples, provided herein are catalytically active fragments of cytochrome P450 reductase polypeptides. In some examples, the active fragments of cytochrome P450 reductase polypeptides are modified as described above. Such fragments retain one or more properties of a full-length cytochrome P450 reductase polypeptide. Typically, the active fragments exhibit CPR activity (i.e., the ability to transfer two electrons from NADPH to a cytochrome P450).

The cytochrome P450 reductase polypeptides provided herein can contain other modifications, for example, modifications not in the primary sequence of the polypeptide, including post-translational modifications. For example, modifications described herein can be in a cytochrome P450 reductase polypeptide that is a fusion polypeptide or chimeric polypeptide, including hybrids of different cytochrome P450 reductase polypeptides (e.g. containing one or more domains or regions from another cytochrome P450 reductase polypeptide) and also synthetic cytochrome P450 reductase polypeptides prepared recombinantly or synthesized or constructed by other methods known in the art based upon the sequence of known polypeptides.

The cytochrome P450 reductase polypeptides provided herein can be used to transfer two electrons from NADPH to a cytochrome P450. Reactions can be performed in vivo, such as in a host cell into which the nucleic acid has been introduced. At least one of the polypeptides will be heterologous to the host. Reactions can also be performed in vitro by contacting the enzyme with the appropriate substrate under appropriate conditions.

Also provided herein are nucleic acid molecules encoding a cytochrome P450 polypeptide and a cytochrome P450 reductase polypeptide. For example, provided herein are nucleic acid molecules encoding a cytochrome P450 santalene oxidase polypeptide and a cytochrome P450 reductase polypeptide. In another example, provided herein are nucleic acid molecules encoding a cytochrome P450 bergamotene oxidase polypeptide and a cytochrome P450 reductase polypeptide. Also provided herein are nucleic acid molecules encoding a santalene synthase, cytochrome P450 polypeptide and a cytochrome P450 reductase polypeptide. For example, provided herein are nucleic acid molecules encoding a santalene synthase, cytochrome P450 santalene oxidase polypeptide and a cytochrome P450 reductase polypeptide. In another example, provided herein are nucleic acid molecules encoding a santalene synthase, cytochrome P450 bergamotene oxidase polypeptide and a cytochrome P450 reductase polypeptide. The nucleic acid molecules can be in the same vector or plasmid or on different vectors or plasmids. In such examples, expression of the nucleic acid molecule(s) in a suitable host, for example, a bacterial or yeast cell, results in expression of cytochrome P450 oxidase and cytochrome P450 reductase, or results in expression of santalene synthase, cytochrome P450 oxidase and cytochrome P450 reductase, depending on the included nucleic acid molecules. Such cells can be used to produce the santalene synthases, the cytochrome P450 oxidases and the cytochrome P450 reductases and/or to perform reactions in vivo to produce santalols and bergamotols. For example, santalols and bergamotols can be generated in a host cell from farnesyl diphosphate (FPP), particularly a yeast cell that overproduces the acyclic terpene precursor FPP.
In some examples, a nucleic acid molecule encoding a farnesyl diphosphate synthase, such as a Santalum album farnesyl diphosphate synthase, can also be expressed in the suitable host, for example, a bacterial or yeast cell, resulting in overproduction of FPP.

1. Cytochrome P450 Reductase Polypeptides

Provided herein are cytochrome P450 reductase polypeptides. Also provided herein are nucleic acid molecules that encode any of the cytochrome P450 reductase polypeptides provided herein. The cytochrome P450 reductase polypeptides provided herein exhibit CPR activity. Typically, the cytochrome P450 reductase polypeptides provided herein exhibit the ability to transfer two electrons from NADPH to a cytochrome P450.

For example, provided herein are cytochrome P450 reductase polypeptides that have a sequence of amino acids set forth in SEQ ID NO:12 or 13. Also provided herein are cytochrome P450 reductase polypeptides that exhibit at least 60% amino acid sequence identity to a cytochrome P450 reductase polypeptide having a sequence of amino acids set forth in SEQ ID NO:12 or 13. For example, the cytochrome P450 reductase polypeptides provided herein can exhibit at least or about 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or more amino acid sequence identity to a cytochrome P450 reductase polypeptide set forth in SEQ ID NO:12 or 13, provided the cytochrome P450 reductase polypeptides exhibit cytochrome P450 reductase activity (i.e. transfer two electrons from NADPH to a cytochrome P450). Percent identity can be determined by one skilled in the art using standard alignment programs.
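
As a rough illustration of the percent identity thresholds above, the following sketch computes identity over two sequences that have already been aligned by a standard alignment program. It uses one common convention, identical residues divided by the number of aligned columns that contain no gap. Alignment programs differ in how they define the denominator, so this convention is an assumption of the sketch, and the short sequences are placeholders.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment.

    Both inputs must be rows of the same alignment, so they have equal
    length, with `gap` marking inserted positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Placeholder alignment: five ungapped columns, four of them identical.
identity = percent_identity("MKT-LL", "MKSALL")
print(f"{identity:.1f}% identity")
print("meets 60% threshold:", identity >= 60.0)
```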

Also provided herein are active fragments of cytochrome P450 reductase polypeptides having a sequence of amino acids set forth in SEQ ID NO:12 or 13. For example, provided herein are truncated cytochrome P450 reductase polypeptides having a sequence of amino acids set forth in SEQ ID NO:14 or 15. Such fragments retain one or more properties of a cytochrome P450 reductase polypeptide. Typically, the active fragments exhibit cytochrome P450 reductase activity (i.e. transfer two electrons from NADPH to a cytochrome P450).

Also provided herein are nucleic acid molecules that have a sequence of nucleotides set forth in SEQ ID NO:10 or 11, or degenerates thereof, that encode a cytochrome P450 reductase polypeptide having a sequence of amino acids set forth in SEQ ID NO:12 or 13, respectively. Also provided herein are nucleic acid molecules that encode a cytochrome P450 reductase polypeptide and have at least 85% sequence identity to a sequence of nucleotides set forth in SEQ ID NO:10 or 11. For example, the nucleic acid molecules provided herein can exhibit at least or about at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% or more sequence identity to a sequence of nucleotides set forth in SEQ ID NO:10 or 11, so long as the encoded cytochrome P450 reductase polypeptide exhibits cytochrome P450 reductase activity (i.e. the ability to transfer two electrons from NADPH to a cytochrome P450). Also provided herein are degenerate sequences of the sequences set forth in SEQ ID NO:10 or 11 encoding a cytochrome P450 reductase polypeptide having a sequence of amino acids set forth in SEQ ID NO:12 or 13, respectively. Percent identity can be determined by one skilled in the art using standard alignment programs.

In some examples, the nucleic acid molecules that encode the cytochrome P450 reductase polypeptides are isolated from the sandalwood tree Santalum album. In other examples, the nucleic acid molecules and encoded cytochrome P450 reductase polypeptides are variants of those isolated from the sandalwood tree Santalum album.

2. Modified Cytochrome P450 Reductase Polypeptides

Provided herein are modified cytochrome P450 reductase polypeptides. The modifications can be made in any region of a cytochrome P450 reductase polypeptide provided the resulting modified cytochrome P450 reductase polypeptides at least retain cytochrome P450 reductase activity (e.g. the ability to transfer two electrons from NADPH to a cytochrome P450).

The modifications can be a single amino acid modification, such as single amino acid replacements (substitutions), insertions or deletions, or multiple amino acid modifications, such as multiple amino acid replacements, insertions or deletions. In some examples, entire or partial domains or regions, such as any domain or region described herein below, are exchanged with corresponding domains or regions or portions thereof from another cytochrome P450 reductase polypeptide. Exemplary of modifications are amino acid replacements, including single or multiple amino acid replacements. For example, modified cytochrome P450 reductase polypeptides provided herein can contain at least or 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 90, 95, 100, 105, 110, 115, 120 or more modified positions compared to the cytochrome P450 reductase polypeptide not containing the modification.

The modifications described herein can be in any cytochrome P450 reductase polypeptide. For example, the modifications described herein can be in a cytochrome P450 reductase having a sequence of amino acids set forth in any of SEQ ID NOS:12-15 or any variant thereof, including any that have at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to a cytochrome P450 reductase having a sequence of amino acids set forth in any of SEQ ID NOS:12-15.

In particular, modified cytochrome P450 reductase polypeptides provided herein contain amino acid replacements or substitutions, additions or deletions, truncations or combinations thereof with reference to the cytochrome P450 reductase polypeptide having a sequence of amino acids set forth in SEQ ID NO:12. It is within the level of one of skill in the art to make such modifications in cytochrome P450 reductase polypeptides, such as any set forth in SEQ ID NOS:12-15 or any variant thereof. Based on this description, it is within the level of one of skill in the art to generate a cytochrome P450 reductase polypeptide containing any one or more of the described mutations, and test each for cytochrome P450 reductase activity described herein, such as the ability to transfer two electrons from NADPH to cytochrome P450.

Also, in some examples, provided herein are modified active fragments of cytochrome P450 reductase polypeptides that contain any of the modifications provided herein. Such fragments retain one or more properties of a cytochrome P450 reductase, such as the ability to transfer two electrons from NADPH to cytochrome P450. Modifications in a cytochrome P450 reductase polypeptide also can be made to a cytochrome P450 reductase polypeptide that also contains other modifications, including modifications of the primary sequence and modifications not in the primary sequence of the polypeptide. For example, modifications described herein can be in a cytochrome P450 reductase polypeptide that is a fusion polypeptide or chimeric polypeptide with different cytochrome P450 reductase polypeptides (e.g. containing one or more domains or regions from other cytochrome P450 reductases) and also synthetic cytochrome P450 reductase polypeptides prepared recombinantly or synthesized or constructed by other methods known in the art based upon the sequence of known polypeptides.

In some examples, the modifications are amino acid replacements. In further examples, the modified cytochrome P450 reductase polypeptides provided herein contain one or more modifications in a domain. For example, the modifications in a domain or structural domain can be by replacement of corresponding heterologous residues from another cytochrome P450 reductase polypeptide.

To retain cytochrome P450 reductase activity, modifications typically are not made at those positions necessary for cytochrome P450 reductase activity, i.e., in the catalytic center or in conserved residues. For example, generally modifications are not made at a position corresponding to Ser485, Cys657, Asp702 or Trp704 with reference to a sequence of amino acids set forth in SEQ ID NO:12.
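
When planning a set of modifications, the positions named above can be screened for automatically. The following is a minimal sketch, assuming the candidate positions are already expressed in the numbering of SEQ ID NO:12 (for a variant, they would first have to be mapped through an alignment). The function name and the candidate list are hypothetical.

```python
# Positions generally left unmodified, numbered with reference to SEQ ID NO:12.
CONSERVED_POSITIONS = {485: "S", 657: "C", 702: "D", 704: "W"}


def disallowed_modifications(positions):
    """Return the sorted candidate positions that fall on a conserved residue."""
    return sorted(p for p in set(positions) if p in CONSERVED_POSITIONS)


# Hypothetical list of positions proposed for amino acid replacement.
candidates = [10, 657, 704, 300]
for position in disallowed_modifications(candidates):
    residue = CONSERVED_POSITIONS[position]
    print(f"position {position} ({residue}) is conserved; generally not modified")
```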

The modified cytochrome P450 reductase polypeptides provided herein can contain two or more modifications, including amino acid replacements or substitutions, insertions or deletions, truncations or combinations thereof. Generally, multiple modifications provided herein can be combined by one of skill in the art so long as the modified cytochrome P450 reductase polypeptide retains cytochrome P450 reductase activity.

Also provided herein are nucleic acid molecules that encode any of the modified cytochrome P450 reductase polypeptides provided herein. In particular examples, the nucleic acid sequence can be codon optimized, for example, to increase expression levels of the encoded sequence. The particular codon usage is dependent on the host organism in which the modified polypeptide is expressed. One of skill in the art is familiar with optimal codons for expression in bacteria or yeast, including for example E. coli or Saccharomyces cerevisiae. For example, codon usage information is available from the Codon Usage Database available at, for example, kazusa.or.jp.codon (see, e.g., Richmond (2000) Genome Biology, 1:241 for a description of the database). See also, Forsburg (2004) Yeast, 10:1045-1047; Brown et al. (1991) Nucleic Acids Research, 19:4298; Sharp et al. (1988) Nucleic Acids Research, 12:8207-8211; Sharp et al. (1991) Yeast, 657-78. In examples herein, nucleic acid sequences provided herein are codon optimized based on codon usage in Saccharomyces cerevisiae.
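
In its simplest form, codon optimization replaces each codon with one that the host uses frequently. The sketch below back-translates a peptide using a single codon per amino acid. The small table is illustrative only, covers ten amino acids, and is an assumption of this sketch rather than the codon usage applied to the sequences herein. In practice the codon choices would be taken from host-specific codon usage data such as the database cited above, and the function name is hypothetical.

```python
# Illustrative one-codon-per-residue table (an assumption, not a full
# Saccharomyces cerevisiae codon usage table). Each codon is a valid
# codon for its amino acid under the standard genetic code.
PREFERRED_CODON = {
    "M": "ATG", "A": "GCT", "R": "AGA", "G": "GGT", "L": "TTG",
    "K": "AAG", "S": "TCT", "W": "TGG", "D": "GAT", "C": "TGT",
}


def back_translate(protein, table=PREFERRED_CODON):
    """Encode a peptide using one chosen codon per amino acid."""
    try:
        return "".join(table[residue] for residue in protein)
    except KeyError as exc:
        raise ValueError(f"no codon listed for residue {exc.args[0]!r}") from exc


# Placeholder tripeptide, not taken from the Sequence Listing.
print(back_translate("MAK"))
```

Choosing a single codon for every occurrence of a residue is the simplest strategy. Other approaches sample codons in proportion to host usage.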

The modified polypeptides and encoding nucleic acid molecules provided herein can be produced by standard recombinant DNA techniques known to one of skill in the art. Any method known in the art to effect mutation of any one or more amino acids in a target protein can be employed. Methods include standard site-directed or random mutagenesis of encoding nucleic acid molecules, or solid phase polypeptide synthesis methods. For example, as described herein, nucleic acid molecules encoding a cytochrome P450 reductase polypeptide can be subjected to mutagenesis, such as random mutagenesis of the encoding nucleic acid, by error-prone PCR, site-directed mutagenesis, overlap PCR, gene shuffling, or other recombinant methods. The nucleic acid encoding the polypeptides then can be introduced into a host cell to be expressed heterologously. Hence, also provided herein are nucleic acid molecules encoding any of the modified polypeptides provided herein. In some examples, the modified cytochrome P450 reductase polypeptides are produced synthetically, such as using solid phase or solution phase peptide synthesis.

3. Additional Modifications

Provided herein are cytochrome P450 reductase polypeptides that contain additional modifications. For example, modified cytochrome P450 reductase polypeptides include truncated cytochrome P450 reductase polypeptides, cytochrome P450 reductase polypeptides having altered activities or properties, chimeric cytochrome P450 reductase polypeptides, cytochrome P450 reductase polypeptides containing domain swaps, cytochrome P450 reductase fusion proteins, or cytochrome P450 reductase polypeptides having any modification described elsewhere herein.

a. Truncated Polypeptides

Also provided herein are truncated cytochrome P450 reductase polypeptides. The truncated cytochrome P450 reductase polypeptides can be truncated at the N-terminus or C-terminus, so long as the truncated cytochrome P450 reductase polypeptides retain the catalytic activity of a cytochrome P450 reductase. Typically, the truncated cytochrome P450 reductase polypeptides exhibit cytochrome P450 reductase activity (i.e., the ability to transfer two electrons from NADPH to cytochrome P450). In some examples, the cytochrome P450 reductase polypeptides are truncated at the C-terminus. In other examples, the cytochrome P450 reductase polypeptides are truncated at the N-terminus.

In some examples, the cytochrome P450 reductase polypeptides are truncated at the N-terminus, C-terminus or both termini of a cytochrome P450 reductase polypeptide provided herein, such as truncation of a sequence of amino acids set forth in any of SEQ ID NOS:12 or 13. In other examples, any of the modified cytochrome P450 reductase polypeptides provided herein are truncated. The modified cytochrome P450 reductase polypeptides can be truncated at their N-terminus, C-terminus, or both termini. For example, any cytochrome P450 reductase polypeptide provided herein can be truncated by at or about or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75 or more amino acid residues at the N-terminus, provided the cytochrome P450 reductase polypeptide retains cytochrome P450 reductase activity. In other examples, any cytochrome P450 reductase polypeptide provided herein can be truncated by at or about or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75 or more amino acid residues at the C-terminus, provided the cytochrome P450 reductase polypeptide retains cytochrome P450 reductase activity. In some examples, cytochrome P450 reductases can be truncated by digestion with pancreatic steapsin or trypsin, which releases the N-terminal hydrophobic anchor.
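As a minimal sketch of the truncations described above, removal of a chosen number of residues from either or both termini of a polypeptide sequence can be expressed as follows. The function name `truncate` and the toy sequence are hypothetical, and the sketch operates on the sequence only; whether a given truncated polypeptide retains cytochrome P450 reductase activity must be determined experimentally.

```python
def truncate(seq, n_term=0, c_term=0):
    """Remove n_term residues from the N-terminus and c_term from the C-terminus."""
    if n_term < 0 or c_term < 0:
        raise ValueError("truncation lengths must be non-negative")
    if n_term + c_term >= len(seq):
        raise ValueError("truncation would remove the entire sequence")
    return seq[n_term:len(seq) - c_term]


if __name__ == "__main__":
    toy = "MKTAYIAKQR"  # toy peptide, not a sequence from the Sequence Listing
    print(truncate(toy, n_term=3))            # N-terminal truncation only
    print(truncate(toy, n_term=3, c_term=2))  # truncation at both termini
```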

For example, provided herein are truncated cytochrome P450 reductase polypeptides having a sequence of amino acids set forth in SEQ ID NO:14 or 15. Also provided herein are truncated cytochrome P450 reductase polypeptides having a sequence of amino acids having at least or at least about 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to a truncated cytochrome P450 reductase having a sequence of amino acids set forth in SEQ ID NO:14 or 15, provided the resulting cytochrome P450 reductase polypeptide at least retains cytochrome P450 reductase activity (i.e., the ability to transfer two electrons from NADPH to cytochrome P450). Also provided herein are nucleic acid molecules having a sequence of nucleotides set forth in SEQ ID NOS:63 or 64 that encode the truncated cytochrome P450 reductase polypeptides having a sequence of amino acids set forth in SEQ ID NO:14 or 15, respectively.
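For illustration, percent sequence identity between two polypeptides that have already been aligned can be computed as in the following sketch. The function name `percent_identity` is hypothetical, gaps are assumed to be written as "-", and the denominator used here (all alignment columns that are not gaps in both sequences) is only one of several conventions in use; it is an assumption of the sketch and is not a definition taken from this description.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pairwise alignment.

    Both inputs are rows of the same alignment, with '-' marking gaps.
    Columns that are gaps in both rows are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy alignment rows, not sequences from the Sequence Listing.
    identity = percent_identity("MKTAY-AK", "MKSAYIAK")
    print(identity, identity >= 65.0)
```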

b. Polypeptides with Altered Activities or Properties

The modified cytochrome P450 reductase polypeptides provided herein can also exhibit changes in activities and/or properties. The modified cytochrome P450 reductase polypeptides can exhibit, for example, improved properties, such as increased catalytic activity, increased stability, and/or increased expression in a host cell. In other examples, the modified cytochrome P450 reductase polypeptide exhibits a similar, increased and/or improved activity compared to the unmodified cytochrome P450 reductase polypeptide.

c. Domain Swaps

Provided herein are modified cytochrome P450 reductase polypeptides that are chimeric polypeptides containing a swap (deletion and insertion) by deletion of amino acid residues of one or more domains or regions therein or portions thereof and insertion of a heterologous sequence of amino acids. In some examples, the heterologous sequence is a randomized sequence of amino acids. In other examples, the heterologous sequence is a contiguous sequence of amino acids for the corresponding domain or region or portion thereof from another cytochrome P450 reductase. The heterologous sequence that is replaced or inserted generally includes at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, or more amino acids. In examples where the heterologous sequence is from a corresponding domain or a portion thereof of another cytochrome P450 reductase, the heterologous sequence generally includes at least 50%, 60%, 70%, 80%, 90%, 95% or more contiguous amino acids of the corresponding domain or region or portion. In such an example, adjacent residues to the heterologous corresponding domain or region or portion thereof also can be included in a modified cytochrome P450 reductase polypeptide provided herein.

In one example of swap mutants provided herein, at least one domain or region or portion thereof of a cytochrome P450 reductase polypeptide is replaced with a contiguous sequence of amino acids for the corresponding domain or region or portions thereof from another cytochrome P450 reductase polypeptide. In some examples, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more domains or regions or portions thereof are replaced with a contiguous sequence of amino acids for the corresponding domain or region or portions thereof from another cytochrome P450 reductase polypeptide.
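At the level of the amino acid sequence, a swap of the kind described above (deletion of a region and insertion of a heterologous sequence) can be sketched as follows. The function name `swap_region`, the 0-based coordinates and the toy sequences are hypothetical choices made for the illustration; the heterologous segment need not have the same length as the region it replaces.

```python
def swap_region(recipient, start, end, heterologous):
    """Replace recipient[start:end] (0-based, end exclusive) with a heterologous segment."""
    if not 0 <= start < end <= len(recipient):
        raise ValueError("region must lie within the recipient sequence")
    return recipient[:start] + heterologous + recipient[end:]


if __name__ == "__main__":
    # Toy sequences, not sequences from the Sequence Listing.
    first = "AAAACCCCGGGG"
    segment_from_second = "TTT"
    print(swap_region(first, 4, 8, segment_from_second))
```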

Any domain or region or portion thereof of a cytochrome P450 reductase polypeptide can be replaced with a heterologous sequence of amino acids, such as heterologous sequence from the corresponding domain or region from another cytochrome P450 reductase. A domain or region can be a structural domain or a functional domain. One of skill in the art is familiar with domains or regions in cytochrome P450 reductases. Functional domains include, for example, the catalytic domain or a portion thereof. A structural domain can include all or a portion of α-helix A; β-strand 1; α-helix B; β-strand 2; α-helix C; β-strand 3; α-helix D; β-strand 4; α-helix E; β-strand 5; α-helix F; β-strand 6; β-strand 7; β-strand 8; β-strand 9; β-strand 10; α-helix G; β-strand 11; β-strand 12; β-strand 12′; α-helix H; α-helix I; α-helix J; α-helix K; α-helix M; β-strand 13; β-strand 14; β-strand 15; α-helix N; β-strand 16; β-strand 16′; β-strand 17; α-helix O; β-strand 18; α-helix P; β-strand 19; α-helix Q; α-helix R; β-strand 20; α-helix S; α-helix T; and β-strand 21. One of skill in the art is familiar with various cytochrome P450s and can identify corresponding domains or regions or portions of amino acids thereof. Typically, the resulting modified cytochrome P450 reductase polypeptides exhibit cytochrome P450 reductase activity.

Any methods known in the art for generating chimeric polypeptides can be used to replace all or a contiguous portion of a domain of a cytochrome P450 reductase with all or a contiguous portion of the corresponding domain of a second cytochrome P450 reductase (see, U.S. Pat. Nos. 5,824,774, 6,072,045, 7,186,891 and 8,106,260, and U.S. Pat. Pub. No. 20110081703). Also, gene shuffling methods can be employed to generate chimeric polypeptides and/or polypeptides with domain or region swaps.

For example, corresponding domains or regions of any two cytochrome P450 reductases can be exchanged using any suitable recombinant method known in the art, or by in vitro synthesis. Exemplary of recombinant methods is a two stage overlapping PCR method, such as described herein. In such methods, primers that introduce mutations at a plurality of codon positions in the nucleic acids encoding the targeted domain or portion thereof in the first cytochrome P450 reductase can be employed; the mutations together form the heterologous region (i.e. the corresponding region from the second cytochrome P450 reductase). Alternatively, for example, randomized amino acids can be used to replace particular domains or regions. It is understood that primer errors, PCR errors and/or other errors in the cloning or recombinant methods can result in errors such that the resulting swapped or replaced region or domain does not exhibit an amino acid sequence that is identical to the corresponding region from the second cytochrome P450 reductase.

In an exemplary PCR-based method, the first stage PCR uses (i) a downstream primer that anneals downstream of the region that is being replaced together with a mutagenic primer that includes approximately fifteen nucleotides (or an effective number to effect annealing, such as 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 20, 25 nucleotides or more) of homologous sequence on each side of the domain or region to be exchanged or randomized flanking the region to be imported into the target gene, and (ii) an upstream primer that anneals upstream of the region that is being replaced together with an opposite strand mutagenic primer that also includes approximately fifteen nucleotides (or an effective number to effect annealing, such as 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 20, 25 nucleotides or more) of homologous sequence on each side of the domain or region to be exchanged or randomized flanking the region to be imported into the target gene. If a replacement in which a domain or region of a first cytochrome P450 reductase gene is replaced with the corresponding domain or region from a second cytochrome P450 reductase is being performed, nucleotides in the mutagenic primers between the flanking regions from the first cytochrome P450 reductase contain codons for the corresponding region of the second cytochrome P450 reductase. In instances where the amino acids in a domain or region are to be randomized, nucleotides of the mutagenic primers between the flanking regions from the first cytochrome P450 reductase contain random nucleotides. An overlapping PCR is then performed to join the two fragments, using the upstream and downstream primers. The resulting PCR product then can be cloned into any suitable vector for expression of the modified cytochrome P450 reductase.
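The layout of the mutagenic primer pair described above can be sketched in silico as follows: the forward mutagenic primer carries a flank of homologous sequence from the first gene on each side of the imported codons, and the opposite strand mutagenic primer is its reverse complement. The function names, the default flank length of fifteen nucleotides and the toy template are assumptions made for the illustration; the sketch builds sequence strings only and does not assess melting temperature, secondary structure or whether primers of the resulting length are practical to synthesize.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def mutagenic_primers(template, start, end, imported, flank=15):
    """Build a forward mutagenic primer and its opposite-strand partner.

    template[start:end] (0-based, end exclusive) is the region being replaced;
    'imported' is the heterologous (or randomized) sequence placed between
    'flank' nucleotides of template sequence on each side.
    """
    if not flank <= start <= end <= len(template) - flank:
        raise ValueError("region must leave room for both flanks")
    forward = template[start - flank:start] + imported + template[end:end + flank]
    return forward, reverse_complement(forward)


if __name__ == "__main__":
    # Toy template, not a sequence from the Sequence Listing.
    toy_template = "A" * 15 + "CCCCCC" + "G" * 15
    fwd, rev = mutagenic_primers(toy_template, 15, 21, "TTT")
    print(fwd)
    print(rev)
```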

Further, any of the modified cytochrome P450 reductase polypeptides containing swap mutations herein can contain one or more further amino acid replacements as described herein above.

d. Additional Variants

Cytochrome P450 reductase polypeptides provided herein can be modified by any method known to one of skill in the art for generating protein variants, including, but not limited to, DNA or gene shuffling, error prone PCR, overlap PCR or other recombinant methods. In one example, nucleic acid molecules encoding any cytochrome P450 reductase polypeptide or variant cytochrome P450 reductase polypeptide provided herein can be modified by gene shuffling. Gene shuffling involves one or more cycles of random fragmentation and reassembly of at least two nucleotide sequences, followed by screening to select nucleotide sequences encoding polypeptides with desired properties. The recombination can be performed in vitro (see Stemmer et al. (1994) Proc Natl Acad Sci USA 91:10747-10751; Stemmer et al. (1994) Nature 370:389-391; Crameri et al. (1998) Nature 391:288-291; U.S. Pat. Nos. 5,605,793, 5,811,238, 5,830,721, 5,834,252 and 5,837,458) or in vivo (see, International Pat. Pub. No. WO199707205). The nucleic acid molecules encoding the polypeptides then can be introduced into a host cell to be expressed heterologously and tested for their cytochrome P450 reductase activity by any method described in section G below.
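The outcome of one cycle of fragmentation and reassembly can be pictured with the following toy in silico model, in which a chimeric sequence is assembled from segments of equal-length parent sequences joined at randomly chosen crossover points. This is a deliberately simplified assumption: it is not the laboratory procedure of the cited references, which relies on homology between fragments during reassembly, and the function name `shuffle_parents` is hypothetical.

```python
import random


def shuffle_parents(parents, n_crossovers, rng):
    """Assemble one chimera from equal-length parents at random crossover points."""
    length = len(parents[0])
    if any(len(parent) != length for parent in parents):
        raise ValueError("toy model requires parents of equal length")
    if not 0 <= n_crossovers < length:
        raise ValueError("too many crossovers for this sequence length")
    # Choose distinct internal cut positions, then take each segment
    # from a randomly selected parent.
    cuts = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0] + cuts + [length]
    pieces = []
    for left, right in zip(bounds, bounds[1:]):
        donor = rng.choice(parents)
        pieces.append(donor[left:right])
    return "".join(pieces)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same chimera
    print(shuffle_parents(["AAAAAAAA", "CCCCCCCC"], 3, rng))
```

In a screening workflow, many such chimeras would be generated and each candidate tested for the desired property; the model above covers only the sequence recombination step.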

e. Fusion or Chimeric Proteins

Nucleic acid molecules provided herein include fusion or chimeric nucleic acid molecules that encode a cytochrome P450 polypeptide and a cytochrome P450 reductase polypeptide. For example, provided herein are nucleic acid molecules encoding a fusion polypeptide that is capable of catalyzing the formation of a santalol or bergamotol, such as an α-santalol, β-santalol, epi-β-santalol or Z-α-trans-bergamotol, from santalenes or bergamotene that contains any cytochrome P450 polypeptide and any cytochrome P450 reductase polypeptide provided herein. For example, provided herein are nucleic acid molecules encoding a fusion polypeptide that contains a cytochrome P450 polypeptide set forth in any of SEQ ID NOS:6-9 and a cytochrome P450 reductase polypeptide set forth in any of SEQ ID NOS:12-15. Also provided herein are fusion polypeptides containing a cytochrome P450 polypeptide set forth in any of SEQ ID NOS:6-9 and a cytochrome P450 reductase polypeptide set forth in any of SEQ ID NOS:12-15. The fusion polypeptides can be linked directly or via a linker.

Nucleic acid molecules provided herein include fusion or chimeric nucleic acid molecules that encode a santalene synthase, a cytochrome P450 polypeptide and a cytochrome P450 reductase. For example, provided herein are nucleic acid molecules encoding a fusion polypeptide that contains a santalene synthase set forth in any of SEQ ID NOS:17, 52 or 53, a cytochrome P450 santalene oxidase polypeptide set forth in SEQ ID NO:7 and a cytochrome P450 reductase set forth in any of SEQ ID NOS:12-15. Also provided herein are fusion polypeptides containing a santalene synthase set forth in any of SEQ ID NOS:17, 52 or 53, a cytochrome P450 santalene oxidase polypeptide set forth in SEQ ID NO:7 and a cytochrome P450 reductase set forth in any of SEQ ID NOS:12-15. In another example, provided herein are nucleic acid molecules encoding a fusion polypeptide that contains a santalene synthase set forth in any of SEQ ID NOS:17, 52 or 53, a cytochrome P450 bergamotene oxidase polypeptide set forth in any of SEQ ID NOS:6, 8 or 9 and a cytochrome P450 reductase set forth in any of SEQ ID NOS:12-15. Also provided herein are fusion polypeptides containing a santalene synthase set forth in any of SEQ ID NOS:17, 52 or 53, a cytochrome P450 bergamotene oxidase polypeptide set forth in any of SEQ ID NOS:6, 8 or 9 and a cytochrome P450 reductase set forth in any of SEQ ID NOS:12-15. The fusion polypeptides can be linked directly or via a linker.

In another example, provided herein is a nucleic acid molecule that encodes a santalene synthase, a cytochrome P450 and/or a cytochrome P450 reductase, such that, when expressed in a host cell, such as a bacterial or yeast host cell, a santalene synthase, a cytochrome P450 and/or a cytochrome P450 reductase are expressed. In another example, provided herein is a nucleic acid molecule that encodes a santalene synthase, a cytochrome P450 santalene oxidase and a cytochrome P450 reductase. In another example, provided herein is a nucleic acid molecule that encodes a santalene synthase, a cytochrome P450 bergamotene oxidase and a cytochrome P450 reductase. Further, when the host cell is capable of producing FPP, the encoded polypeptides catalyze the production of santalols and/or bergamotols.

Other examples of fusion proteins include, but are not limited to, fusions of a signal sequence, a tag such as for localization, e.g. a his₆ tag or a myc tag, or a tag for purification, for example, a GST fusion, GFP fusion or CBP fusion, and a sequence for directing protein secretion and/or membrane association.

E. METHODS FOR PRODUCING MODIFIED CYTOCHROME P450 AND CYTOCHROME P450 REDUCTASE POLYPEPTIDES AND ENCODING NUCLEIC ACID MOLECULES

Provided are methods for producing modified cytochrome P450 and cytochrome P450 reductase polypeptides, including santalene oxidase and bergamotene oxidase polypeptides. The methods can be used to generate cytochrome P450s and cytochrome P450 reductases with desired properties, including, but not limited to, increased catalytic activity, increased selectivity, increased substrate specificity, increased substrate binding, increased stability, increased expression in a host cell, altered product distribution and/or altered substrate specificity. Modified cytochrome P450s and cytochrome P450 reductases can be produced using any method known in the art and, optionally, screened for the desired properties. In particular examples, modified cytochrome P450s and cytochrome P450 reductases with desired properties are generated by mutation in accord with the methods exemplified herein. Thus, provided herein are modified cytochrome P450s and cytochrome P450 reductases and nucleic acid molecules encoding the modified cytochrome P450s and cytochrome P450 reductases that are produced using the methods described herein.

Exemplary of the methods provided herein are those in which modified cytochrome P450s and cytochrome P450 reductases are produced by replacing one or more endogenous domains or regions of a first cytochrome P450 or cytochrome P450 reductase with the corresponding domain(s) or region(s) from a second cytochrome P450 or cytochrome P450 reductase (i.e. heterologous domains or regions). In further examples, two or more endogenous domains or regions of a first cytochrome P450 or cytochrome P450 reductase are replaced with the corresponding heterologous domain(s) or region(s) from two or more other cytochrome P450s or cytochrome P450 reductases, such as a second, third, fourth, fifth, sixth, seventh, eighth, ninth, or tenth cytochrome P450 or cytochrome P450 reductase. Thus, the resulting modified cytochrome P450 or cytochrome P450 reductase can include heterologous domains or regions from 1, 2, 3, 4, 5, 6, 7, 8, 9 or more different cytochrome P450s or cytochrome P450 reductases. In further examples, the methods also or instead include replacing one or more domains or regions of a first cytochrome P450 or cytochrome P450 reductase with randomized amino acid residues.

Any cytochrome P450 or cytochrome P450 reductase can be used in the methods provided herein. The first cytochrome P450 or cytochrome P450 reductase (i.e. the cytochrome P450 or cytochrome P450 reductase to be modified) can be of the same or different class as the second (or third, fourth, fifth, etc.) cytochrome P450 or cytochrome P450 reductase (i.e. the cytochrome P450(s) or cytochrome P450 reductase(s) from which the heterologous domain(s) or region(s) is derived).

In practicing the methods provided herein, all or a contiguous portion of an endogenous domain of a first cytochrome P450 or cytochrome P450 reductase can be replaced with all or a contiguous portion of the corresponding heterologous domain from a second cytochrome P450 or cytochrome P450 reductase. For example, 3, 4, 5, 6, 7, 8, 9, 10 or more contiguous amino acids from a domain or region in a first cytochrome P450 or cytochrome P450 reductase can be replaced with 3, 4, 5, 6, 7, 8, 9, 10 or more contiguous amino acids from the corresponding region from a second cytochrome P450 or cytochrome P450 reductase. In some examples, one or more amino acid residues adjacent to the endogenous domain of the first cytochrome P450 or cytochrome P450 reductase also are replaced, and/or one or more amino acid residues adjacent to the heterologous domain also are used in the replacement. Further, the methods provided herein also include methods in which all or a contiguous portion of a first domain and all or a contiguous portion of a second adjacent domain are replaced with the corresponding domains (or portions thereof) from another cytochrome P450 or cytochrome P450 reductase.

Domains or regions that can be replaced include functional domains or structural domains. Exemplary domains or regions that can be replaced in a cytochrome P450 using the methods described herein include, but are not limited to, structural domains or regions corresponding to helix A, β strand 1-1, β strand 1-2, helix B, β strand 1-5, helix B′, helix C, helix C′, helix D, β strand 3-1, helix E, helix F, helix G, helix H, β strand 5-1, β strand 5-2, helix I, helix J, helix J′, helix K, β strand 1-4, β strand 2-1, β strand 2-2, β strand 1-3, helix K′, helix K″, Heme domain, helix L, β strand 3-3, β strand 4-1, β strand 4-2 and β strand 3-2. Any one or more of these domains or regions, or a portion thereof, can be replaced with a corresponding domain from another cytochrome P450 using the methods provided herein. These domains or regions can be identified in any cytochrome P450 using methods well known in the art, such as, for example, by alignment using methods known to those of skill in the art (see, e.g., FIGS. 5A-5B). Such methods typically maximize matches, and include methods such as using manual alignments and by using the numerous alignment programs available (for example, BLASTP) and others known to those of skill in the art. By aligning the sequences of the cytochrome P450 set forth in SEQ ID NO:50, and any other cytochrome P450, any of the domains or regions recited above can be identified in any cytochrome P450.

Exemplary domains or regions that can be replaced in a cytochrome P450 reductase using the methods described herein include, but are not limited to, structural domains or regions corresponding to α-helix A; β-strand 1; α-helix B; β-strand 2; α-helix C; β-strand 3; α-helix D; β-strand 4; α-helix E; β-strand 5; α-helix F; β-strand 6; β-strand 7; β-strand 8; β-strand 9; β-strand 10; α-helix G; β-strand 11; β-strand 12; β-strand 12′; α-helix H; α-helix I; α-helix J; α-helix K; α-helix M; β-strand 13; β-strand 14; β-strand 15; α-helix N; β-strand 16; β-strand 16′; β-strand 17; α-helix O; β-strand 18; α-helix P; β-strand 19; α-helix Q; α-helix R; β-strand 20; α-helix S; α-helix T; and β-strand 21. These domains or regions can be identified in any cytochrome P450 reductase using methods well known in the art, such as, for example, by alignment using methods known to those of skill in the art (see, e.g., FIGS. 3A-3C). Such methods typically maximize matches, and include methods such as using manual alignments and by using the numerous alignment programs available (for example, BLASTP) and others known to those of skill in the art. By aligning the sequences of the cytochrome P450 reductase set forth in SEQ ID NO:12, and any other cytochrome P450 reductase, any of the domains or regions recited above can be identified in any cytochrome P450 reductase.
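Identification of a corresponding domain or region by alignment, as described above, can be sketched as a coordinate mapping through a pairwise alignment that has already been computed (for example, manually or with an alignment program). The function name `map_region`, the use of "-" for gaps and the 0-based, end-exclusive coordinates are assumptions of the sketch, and the toy alignment rows are not sequences from the Sequence Listing.

```python
def map_region(aligned_ref, aligned_query, ref_start, ref_end):
    """Map reference residues [ref_start, ref_end) onto query coordinates.

    Inputs are the two rows of one pairwise alignment ('-' marks a gap).
    Coordinates are 0-based positions in the ungapped sequences. Returns
    (query_start, query_end) with an exclusive end, or None when no residue
    of the reference region is aligned to a query residue.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("alignment rows must have the same length")
    ref_pos = query_pos = 0
    hits = []
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        in_region = ref_start <= ref_pos < ref_end
        if ref_char != "-" and query_char != "-" and in_region:
            hits.append(query_pos)
        if ref_char != "-":
            ref_pos += 1
        if query_char != "-":
            query_pos += 1
    if not hits:
        return None
    return hits[0], hits[-1] + 1


if __name__ == "__main__":
    # Reference residues 2..4 ("TAY") are mapped onto the query sequence.
    print(map_region("MK-TAYIAK", "MKQTA-IAK", 2, 5))
```

The returned interval spans from the first to the last aligned query residue, so any query residues inserted within the region are included; whether that is appropriate for a given swap is a design choice left to the practitioner.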

In the methods provided herein, all or a contiguous portion of an endogenous domain of a first cytochrome P450 or cytochrome P450 reductase can be replaced with all or a contiguous portion of the corresponding heterologous domain from a second cytochrome P450 or cytochrome P450 reductase using any suitable recombinant method known in the art as discussed above in Sections C.4.c. and D.3.c.

F. EXPRESSION OF CYTOCHROME P450 AND CYTOCHROME P450 REDUCTASE POLYPEPTIDES AND ENCODING NUCLEIC ACID MOLECULES

Cytochrome P450 and cytochrome P450 reductase polypeptides and active fragments thereof, including cytochrome P450 santalene oxidase and cytochrome P450 bergamotene oxidase polypeptides, can be obtained by methods well known in the art for recombinant protein generation and expression. Such cytochrome P450 santalene oxidase polypeptides can be used to produce santalols from santalenes in a host cell from which the cytochrome P450 santalene oxidase is expressed or in vitro following purification of the cytochrome P450 santalene oxidase polypeptide. Such cytochrome P450 bergamotene oxidase polypeptides can be used to produce bergamotols from bergamotenes in a host cell from which the cytochrome P450 bergamotene oxidase is expressed or in vitro following purification of the cytochrome P450 bergamotene oxidase polypeptide. Such cytochrome P450 santalene oxidase and cytochrome P450 bergamotene oxidase polypeptides can be used to produce santalols or bergamotols from a suitable acyclic pyrophosphate precursor, such as FPP, in a host cell in which a santalene synthase and the cytochrome P450 are expressed. Any method known to those of skill in the art for identification of nucleic acids that encode desired genes can be used to obtain the nucleic acid encoding a cytochrome P450, such as a cytochrome P450 santalene oxidase or cytochrome P450 bergamotene oxidase, or cytochrome P450 reductase. For example, nucleic acid encoding unmodified or wild type cytochrome P450 polypeptides or cytochrome P450 reductase polypeptides can be obtained using well known methods from a plant source, such as Santalum album. Modified cytochrome P450 polypeptides or cytochrome P450 reductase polypeptides then can be engineered using any method known in the art for introducing mutations into unmodified or wild type cytochrome P450 polypeptides or cytochrome P450 reductase polypeptides, including any method described herein, such as random mutagenesis of the encoding nucleic acid by error-prone PCR, site-directed mutagenesis, overlap PCR, or other recombinant methods. The nucleic acids encoding the polypeptides then can be introduced into a host cell to be expressed heterologously.

In some examples, the cytochrome P450 polypeptides or cytochrome P450 reductase polypeptides provided herein, including cytochrome P450 santalene oxidase and cytochrome P450 bergamotene oxidase polypeptides, are produced synthetically, such as using solid phase or solution phase peptide synthesis.

1. Isolation of Nucleic Acid Encoding Santalum album Cytochrome P450 and Cytochrome P450 Reductase Polypeptides

Nucleic acids encoding cytochrome P450s or cytochrome P450 reductases, such as cytochrome P450 santalene oxidase and cytochrome P450 bergamotene oxidase, can be cloned or isolated using any available methods known in the art for cloning and isolating nucleic acid molecules. Such methods include PCR amplification of nucleic acids and screening of libraries, including nucleic acid hybridization screening. In some examples, methods for amplification of nucleic acids can be used to isolate nucleic acid molecules encoding a cytochrome P450 or cytochrome P450 reductase polypeptide, including for example, polymerase chain reaction (PCR) methods. A nucleic acid containing material can be used as a starting material from which a cytochrome P450 or cytochrome P450 reductase-encoding nucleic acid molecule can be isolated. For example, DNA and mRNA preparations from Santalum species, including but not limited to Santalum album, can be used to obtain cytochrome P450 or cytochrome P450 reductase genes. Nucleic acid libraries also can be used as a source of starting material. Primers can be designed to amplify a cytochrome P450 or cytochrome P450 reductase-encoding molecule, such as a cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase-encoding molecule. For example, primers can be designed based on known nucleic acid sequences encoding a cytochrome P450 such as those set forth in SEQ ID NOS:22-25. In another example, primers can be designed based on known nucleic acid sequences encoding a cytochrome P450 reductase such as those set forth in SEQ ID NOS:40-41. Nucleic acid molecules generated by amplification can be sequenced and confirmed to encode a cytochrome P450 or cytochrome P450 reductase polypeptide. The nucleic acid molecules provided herein can be used to identify related nucleic acid molecules in other species.

Additional nucleotide sequences can be joined to a cytochrome P450 or cytochrome P450 reductase-encoding nucleic acid molecule, including linker sequences containing restriction endonuclease sites for the purpose of cloning the synthetic gene into a vector, for example, a protein expression vector or a vector designed for the amplification of the core protein coding DNA sequences. Furthermore, additional nucleotide sequences specifying functional DNA elements can be operatively linked to a cytochrome P450 or cytochrome P450 reductase-encoding nucleic acid molecule. Still further, nucleic acid encoding other moieties or domains also can be included so that the resulting polypeptide is a fusion protein. For example, nucleic acids encoding other enzymes, such as FPP synthase or santalene synthase, or protein purification tags, such as His or Flag tags, can be included.

2. Generation of Modified Nucleic Acid

Nucleic acid encoding a cytochrome P450 or cytochrome P450 reductase, such as a modified cytochrome P450 santalene oxidase polypeptide, modified cytochrome P450 bergamotene oxidase polypeptide or modified cytochrome P450 reductase polypeptide, can be prepared or generated using any method known in the art to effect mutation. Methods for modification include standard rational and/or random mutagenesis of encoding nucleic acid molecules (using e.g., error prone PCR, random site-directed saturation mutagenesis, DNA shuffling or rational site-directed mutagenesis, such as, for example, mutagenesis kits (e.g. QuikChange available from Stratagene)). In addition, routine recombinant DNA techniques can be used to generate nucleic acids encoding polypeptides that contain heterologous amino acids. For example, nucleic acid encoding chimeric polypeptides or polypeptides containing heterologous amino acid sequence can be generated using a two-step PCR method, such as described above, and/or using restriction enzymes and cloning methodologies for routine subcloning of the desired chimeric polypeptide components.

Once generated, the nucleic acid molecules can be expressed in cells to generate modified cytochrome P450 or cytochrome P450 reductase polypeptides using any method known in the art. The modified cytochrome P450 or cytochrome P450 reductase polypeptides, such as modified cytochrome P450 santalene oxidase polypeptides, modified cytochrome P450 bergamotene oxidase polypeptides or modified cytochrome P450 reductase polypeptides, then can be assessed by screening for a desired property or activity, for example, for the ability to produce a terpenoid from a terpene substrate. In particular examples, modified cytochrome P450 or cytochrome P450 reductase polypeptides with desired properties are generated by mutation and screened for a property in accord with the examples provided herein. Typically, in instances where a modified cytochrome P450 santalene oxidase polypeptide is generated, the modified cytochrome P450 santalene oxidase polypeptides produce a santalol from a santalene. Typically, in instances where a modified cytochrome P450 bergamotene oxidase polypeptide is generated, the modified cytochrome P450 bergamotene oxidase polypeptides produce a bergamotol from a bergamotene.

3. Vectors and Cells

For recombinant expression of one or more of the cytochrome P450 or cytochrome P450 reductase polypeptides provided herein, including cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase polypeptides, the nucleic acid containing all or a portion of the nucleotide sequence encoding the polypeptide can be inserted into an appropriate expression vector, i.e., a vector that contains the necessary elements for the transcription and translation of the inserted protein coding sequence. Depending upon the expression system used, the necessary transcriptional and translational signals also can be supplied by the native promoter for a cytochrome P450 or cytochrome P450 reductase gene, and/or their flanking regions. Thus, also provided herein are vectors that contain nucleic acid encoding any cytochrome P450 or cytochrome P450 reductase polypeptide provided herein. Exemplary vectors include but are not limited to pESC-LEU, pESC-LEU2d, and pYEDP60.

Cells, including prokaryotic and eukaryotic cells, containing the vector also are provided. Also provided are host cells containing nucleic acid molecules encoding cytochrome P450 polypeptides provided herein, including cytochrome P450 santalene oxidases, cytochrome P450 bergamotene oxidases and cytochrome P450 reductases. Such cells and host cells include bacterial cells, yeast cells, fungal cells, archaea, plant cells, insect cells and animal cells. In particular examples, the cells or host cells are yeast cells, such as Saccharomyces cerevisiae or Pichia pastoris cells. In particular examples, the cells or host cells are Saccharomyces cerevisiae cells that produce an acyclic pyrophosphate terpene precursor, such as farnesyl diphosphate (FPP). In some examples, the cells or host cells containing a cytochrome P450 provided herein can be modified to produce more FPP than an unmodified cell.

The cells are used to produce a cytochrome P450 or cytochrome P450 reductase polypeptide, such as cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase polypeptides, by growing the above-described cells under conditions whereby the encoded cytochrome P450 or cytochrome P450 reductase is expressed by the cell. In some examples, the cytochrome P450 polypeptide, such as a cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase polypeptide, is heterologous to the cell. In some instances, the expressed cytochrome P450s and/or cytochrome P450 reductases are purified. In other instances, the expressed cytochrome P450s and cytochrome P450 reductases convert one or more santalenes or bergamotenes to one or more santalols or bergamotols in the host cell. In some examples, a santalene synthase, a cytochrome P450 santalene oxidase and a cytochrome P450 reductase are expressed, thereby converting the acyclic pyrophosphate terpene precursor FPP to santalol. In other examples, a santalene synthase, a cytochrome P450 bergamotene oxidase and a cytochrome P450 reductase are expressed, thereby converting the acyclic pyrophosphate terpene precursor FPP to bergamotol.

Any method known to those of skill in the art for the insertion of DNA fragments into a vector can be used to construct expression vectors containing a chimeric gene containing appropriate transcriptional/translational control signals and protein coding sequences. These methods can include in vitro recombinant DNA and synthetic techniques and in vivo recombinants (genetic recombination). Expression of nucleic acid sequences encoding a cytochrome P450 or cytochrome P450 reductase polypeptide or modified cytochrome P450 or cytochrome P450 reductase polypeptide, or domains, derivatives, fragments or homologs thereof, can be regulated by a second nucleic acid sequence so that the genes or fragments thereof are expressed in a host transformed with the recombinant DNA molecule(s). For example, expression of the proteins can be controlled by any promoter/enhancer known in the art. In one embodiment, the promoter is not native to the genes for a cytochrome P450 or cytochrome P450 reductase protein. Promoters that can be used include but are not limited to prokaryotic, yeast, mammalian and plant promoters. The type of promoter depends upon the expression system used, described in more detail below.

In one embodiment, a vector is used that contains a promoter operably linked to nucleic acids encoding a cytochrome P450 or cytochrome P450 reductase polypeptide or modified cytochrome P450 or cytochrome P450 reductase polypeptide, or a domain, fragment, derivative or homolog thereof, one or more origins of replication, and optionally, one or more selectable markers (e.g., an antibiotic resistance gene). Vectors and systems for expression of cytochrome P450 or cytochrome P450 reductase polypeptides are described below.

4. Expression Systems

Cytochrome P450 or cytochrome P450 reductase polypeptides, including cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase polypeptides (modified and unmodified), can be produced by any methods known in the art for protein production including in vitro and in vivo methods such as, for example, the introduction of nucleic acid molecules encoding the cytochrome P450 or cytochrome P450 reductase (e.g. cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase) into a host cell or host plant for in vivo production or expression from nucleic acid molecules encoding the cytochrome P450 or cytochrome P450 reductases (e.g. cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase) in vitro. Cytochrome P450 or cytochrome P450 reductase polypeptides such as cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase and modified cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase polypeptides can be expressed in any organism suitable to produce the required amounts and forms of the polypeptide. Expression hosts include prokaryotic and eukaryotic organisms such as E. coli, yeast, plants, insect cells, mammalian cells, including human cell lines and transgenic animals. Expression hosts can differ in their protein production levels as well as the types of post-translational modifications that are present on the expressed proteins. The choice of expression host can be made based on these and other factors, such as regulatory and safety considerations, production costs and the need and methods for purification.

Expression in eukaryotic hosts can include expression in yeasts such as those from the Saccharomyces genus (e.g. Saccharomyces cerevisiae) and Pichia genus (e.g. Pichia pastoris), insect cells such as Drosophila cells and lepidopteran cells, plants and plant cells such as citrus, tobacco, corn, rice, algae, and lemna. Eukaryotic cells for expression also include mammalian cell lines such as Chinese hamster ovary (CHO) cells or baby hamster kidney (BHK) cells. Eukaryotic expression hosts also include transgenic animals, for example, including production in serum, milk and eggs.

Many expression vectors are available and known to those of skill in the art for the expression of a cytochrome P450 or cytochrome P450 reductase, such as cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase. Exemplary of expression vectors are those encoding a santalene synthase and an FPP synthase, including the vectors described in Example 7. The choice of expression vector is influenced by the choice of host expression system. Such selection is well within the level of skill of the skilled artisan. In general, expression vectors can include transcriptional promoters and optionally enhancers, translational signals, and transcriptional and translational termination signals. Expression vectors that are used for stable transformation typically have a selectable marker which allows selection and maintenance of the transformed cells. In some cases, an origin of replication can be used to amplify the copy number of the vectors in the cells.

Cytochrome P450 or cytochrome P450 reductase polypeptides, including cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase polypeptides and modified cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase polypeptides, also can be used or expressed as protein fusions. For example, a fusion can be generated to add additional functionality to a polypeptide. Examples of fusion proteins include, but are not limited to, fusions of a signal sequence, a tag such as for localization, e.g. a his₆ tag or a myc tag, or a tag for purification, for example, a GST fusion, GFP fusion or CBP fusion, and a sequence for directing protein secretion and/or membrane association.

Methods of production of cytochrome P450 and cytochrome P450 reductase polypeptides, including cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase polypeptides, can include production of an acyclic pyrophosphate terpene precursor, such as FPP, in the host cell. In some instances, the host cell naturally produces FPP. Such a cell can be modified to produce greater quantities of FPP (see e.g. U.S. Pat. Nos. 6,531,303, 6,689,593, 7,838,279 and 7,842,497). In other instances, a host cell that does not naturally produce FPP is modified genetically to produce FPP.

a. Prokaryotic Cells

Prokaryotes, especially E. coli, provide a system for producing large amounts of the cytochrome P450 and cytochrome P450 reductase polypeptides provided herein. Transformation of E. coli is a simple and rapid technique well known to those of skill in the art. Exemplary expression vectors for transformation of E. coli cells include, for example, the pGEM expression vectors, the pQE expression vectors, and the pET expression vectors (see, U.S. Pat. No. 4,952,496; available from Novagen, Madison, Wis.; see, also literature published by Novagen describing the system). Such plasmids include pET 11a, which contains the T7lac promoter, T7 terminator, the inducible E. coli lac operator, and the lac repressor gene; pET 12a-c, which contains the T7 promoter, T7 terminator, and the E. coli ompT secretion signal; pET 15b and pET 19b (Novagen, Madison, Wis.), which contain a His-Tag™ leader sequence for use in purification with a His column and a thrombin cleavage site that permits cleavage following purification over the column, the T7-lac promoter region and the T7 terminator; and pACYC-Duet (Novagen, Madison, Wis.; SEQ ID NO:45).

Expression vectors for E. coli can contain inducible promoters that are useful for inducing high levels of protein expression and for expressing proteins that exhibit some toxicity to the host cells. Exemplary prokaryotic promoters include, for example, the β-lactamase promoter (Jay et al., (1981) Proc. Natl. Acad. Sci. USA 78:5543) and the tac promoter (DeBoer et al., (1983) Proc. Natl. Acad. Sci. USA 80:21-25; see also “Useful Proteins from Recombinant Bacteria” in Scientific American 242:79-94 (1980)). Examples of inducible promoters include the lac promoter, the trp promoter, the hybrid tac promoter, the T7 and SP6 RNA promoters and the temperature regulated λP_(L) promoter.

Cytochrome P450s and cytochrome P450 reductases, including cytochrome P450 santalene oxidase polypeptides, cytochrome P450 bergamotene oxidase polypeptides and cytochrome P450 reductase polypeptides, can be expressed in the cytoplasmic environment of E. coli. The cytoplasm is a reducing environment and for some molecules, this can result in the formation of insoluble inclusion bodies. Reducing agents such as dithiothreitol and β-mercaptoethanol and denaturants (e.g., guanidine-HCl and urea) can be used to resolubilize the proteins. An alternative approach is the expression of cytochrome P450s and cytochrome P450 reductases in the periplasmic space of bacteria, which provides an oxidizing environment and chaperonin-like and disulfide isomerases leading to the production of soluble protein. Typically, a leader sequence is fused to the protein to be expressed which directs the protein to the periplasm. The leader is then removed by signal peptidases inside the periplasm. Examples of periplasmic-targeting leader sequences include the pelB leader from the pectate lyase gene and the leader derived from the alkaline phosphatase gene. In some cases, periplasmic expression allows leakage of the expressed protein into the culture medium. The secretion of proteins allows quick and simple purification from the culture supernatant. Proteins that are not secreted can be obtained from the periplasm by osmotic lysis. Similar to cytoplasmic expression, in some cases proteins can become insoluble and denaturants and reducing agents can be used to facilitate solubilization and refolding. Temperature of induction and growth also can influence expression levels and solubility. Typically, temperatures between 25° C. and 37° C. are used. Mutations also can be used to increase solubility of expressed proteins. Typically, bacteria produce aglycosylated proteins.

b. Yeast Cells

Yeast systems, such as, but not limited to, those from the Saccharomyces genus (e.g. Saccharomyces cerevisiae), Schizosaccharomyces pombe, Yarrowia lipolytica, Kluyveromyces lactis, and Pichia pastoris can be used to express the cytochrome P450s and cytochrome P450 reductases, such as cytochrome P450 santalene oxidase polypeptides, cytochrome P450 bergamotene oxidase polypeptides and cytochrome P450 reductase polypeptides and modified cytochrome P450 santalene oxidase polypeptides, cytochrome P450 bergamotene oxidase polypeptides and cytochrome P450 reductase polypeptides, provided herein. Yeast expression systems also can be used to produce terpenes whose reactions are catalyzed by the synthases. Yeast can be transformed with episomal replicating vectors or by stable chromosomal integration by homologous recombination. In some examples, inducible promoters are used to regulate gene expression. Exemplary promoter sequences for expression of cytochrome P450 and cytochrome P450 reductase polypeptides in yeast include, among others, promoters for metallothionein, 3-phosphoglycerate kinase (Hitzeman et al. (1980) J. Biol. Chem. 255:2073), or other glycolytic enzymes (Hess et al. (1968) J. Adv. Enzyme Reg. 7:149; and Holland et al. (1978) Biochem. 17:4900), such as enolase, glyceraldehyde phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose isomerase, and glucokinase.

Other suitable vectors and promoters for use in yeast expression are further described in Hitzeman, EPA-73,657 or in Fleer et al. (1991) Gene, 107:285-295; and van den Berg et al. (1990) Bio/Technology, 8:135-139. Another alternative includes, but is not limited to, the glucose-repressible ADH2 promoter described by Russell et al. (J. Biol. Chem. 258:2674, 1982) and Beier et al. (Nature 300:724, 1982), or a modified ADH1 promoter. Shuttle vectors replicable in yeast and E. coli can be constructed by, for example, inserting DNA sequences from pBR322 for selection and replication in E. coli (Amp^(r) gene and origin of replication) into the above-described yeast vectors.

Yeast expression vectors can include a selectable marker such as LEU2, TRP1, HIS3, and URA3 for selection and maintenance of the transformed DNA. Exemplary vectors include pESC-Leu, pESC-Leu2D, pESC-His and pYEDP60. Proteins expressed in yeast are often soluble and co-expression with chaperonins, such as BiP and protein disulfide isomerase, can improve expression levels and solubility. Additionally, proteins expressed in yeast can be directed for secretion using secretion signal peptide fusions such as the yeast mating type alpha-factor secretion signal from Saccharomyces cerevisiae and fusions with yeast cell surface proteins such as the Aga2p mating adhesion receptor or the Arxula adeninivorans glucoamylase. A protease cleavage site (e.g., the Kex-2 protease) can be engineered to remove the fused sequences from the polypeptides as they exit the secretion pathway.

Yeast naturally express the required proteins, including FPP synthase (ERG20; which can produce FPP) for the mevalonate-dependent isoprenoid biosynthetic pathway. Thus, expression of the cytochrome P450s and cytochrome P450 reductases, including cytochrome P450 santalene oxidase polypeptides, cytochrome P450 bergamotene oxidase polypeptides and cytochrome P450 reductase polypeptides provided herein, in yeast cells can result in the production of sesquiterpenes, such as santalenes and bergamotenes from FPP, and santalols and bergamotols. Exemplary yeast cells for the expression of cytochrome P450s and cytochrome P450 reductases, including cytochrome P450 santalene oxidase polypeptides, cytochrome P450 bergamotene oxidase polypeptides and cytochrome P450 reductase polypeptides, include yeast modified to produce increased levels of FPP. For example, yeast cells can be modified to produce less squalene synthase or less active squalene synthase (e.g. erg9 mutants; see e.g. U.S. Pat. Nos. 6,531,303 and 6,689,593). This results in accumulation of FPP in the host cell at higher levels compared to wild type yeast cells, which in turn can result in increased yields of sesquiterpenes and sesquiterpenoids (e.g. santalenes, bergamotenes, santalols and bergamotols). In another example, yeast cells can be modified to produce more FPP synthase by introduction of an FPP synthase gene, such as SaFPPS from Santalum album (SEQ ID NO:18). In some examples, the native FPP synthase gene in such yeast can be deleted. Other modifications that enable increased production of FPP in yeast include, for example, but are not limited to, modifications that increase production of acetyl CoA, inactivate genes that encode enzymes that use FPP and GPP as substrate and overexpress HMG-CoA reductases, as described in U.S. Pat. No. 7,842,497. Exemplary modified yeast cells include, but are not limited to, YPH499 (MATa, ura3-52, lys2-801, ade2-101, trp1-Δ63, his3-Δ200, leu2-Δ1), WAT11 (MATa, ade2-1, his3-11,-15; leu2-3,-112, ura3-1, canR, cyr+; containing chromosomally integrated Arabidopsis NADPH-dependent P450 reductase ATR1; see Pompon et al. (1995) Toxicol Lett 82-83:815-822; Ro et al. (2005) Proc Natl Acad Sci USA 102:8060-8065), BY4741 (MATa, his3Δ1, leu2Δ0, met15Δ0, ura3Δ0; ATCC #201388), and modified Saccharomyces cerevisiae strains CALI5-1 (ura3, leu2, his3, trp1, Δerg9::HIS3, HMG2cat/TRP1::rDNA, dpp1, sue), ALX7-95 (ura3, his3, trp1, Δerg9::HIS3, HMG2cat/TRP1::rDNA, dpp1, sue) and ALX11-30 (ura3, trp1, erg9^(def)25, HMG2cat/TRP1::rDNA, dpp1, sue), which are known and described in one or more of U.S. Pat. Nos. 6,531,303, 6,689,593, 7,838,279, 7,842,497, and U.S. Pat. Publication Nos. 20040249219 and 20110189717.

c. Plants and Plant Cells

Transgenic plant cells and plants can be used for the expression of cytochrome P450s and cytochrome P450 reductases, including cytochrome P450 santalene oxidase polypeptides, cytochrome P450 bergamotene oxidase polypeptides and cytochrome P450 reductase polypeptides provided herein. Expression constructs are typically transferred to plants using direct DNA transfer such as microprojectile bombardment and PEG-mediated transfer into protoplasts, and with Agrobacterium-mediated transformation. Expression vectors can include promoter and enhancer sequences, transcriptional termination elements, and translational control elements. Expression vectors and transformation techniques are usually divided between dicot hosts, such as Arabidopsis and tobacco, and monocot hosts, such as corn and rice. Examples of plant promoters used for expression include the cauliflower mosaic virus promoter, the nopaline synthase promoter, the ribulose bisphosphate carboxylase promoter and the ubiquitin and UBQ3 promoters. Selectable markers such as hygromycin, phosphomannose isomerase and neomycin phosphotransferase are often used to facilitate selection and maintenance of transformed cells. Transformed plant cells can be maintained in culture as cells, aggregates (callus tissue) or regenerated into whole plants. Transgenic plant cells also can include algae engineered to produce proteins (see, for example, Mayfield et al. (2003) Proc Natl Acad Sci USA 100:438-442). Transformed plants include, for example, plants selected from the genera Nicotiana, Solanum, Sorghum, Arabidopsis, Medicago (alfalfa), Gossypium (cotton) and Brassica (rape). In some examples, the plant belongs to the species Nicotiana tabacum, and is transformed with vectors that overexpress a cytochrome P450 and/or a cytochrome P450 reductase, such as described in U.S. Pat. Pub. No. 20090123984 and U.S. Pat. No. 7,906,710.

d. Insects and Insect Cells

Insects and insect cells, particularly a baculovirus expression system, can be used for expressing cytochrome P450s and cytochrome P450 reductases, including cytochrome P450 santalene oxidase polypeptides, cytochrome P450 bergamotene oxidase polypeptides and cytochrome P450 reductase polypeptides provided herein (see, for example, Muneta et al. (2003) J. Vet. Med. Sci. 65(2):219-223). Insect cells and insect larvae, including expression in the haemolymph, express high levels of protein and are capable of most of the post-translational modifications used by higher eukaryotes. Baculoviruses have a restrictive host range which improves the safety and reduces regulatory concerns of eukaryotic expression. Typically, expression vectors use a promoter such as the polyhedrin promoter of baculovirus for high level expression. Commonly used baculovirus systems include baculoviruses such as Autographa californica nuclear polyhedrosis virus (AcNPV), and the Bombyx mori nuclear polyhedrosis virus (BmNPV) and an insect cell line such as Sf9 derived from Spodoptera frugiperda (see, e.g., Mizutani and Ohta (1998) Plant Physiology 116:357-367), Pseudaletia unipuncta (A7S) and Danaus plexippus (DpN1). For high level expression, the nucleotide sequence of the molecule to be expressed is fused immediately downstream of the polyhedrin initiation codon of the virus. Mammalian secretion signals are accurately processed in insect cells and can be used to secrete the expressed protein into the culture medium. In addition, the cell lines Pseudaletia unipuncta (A7S) and Danaus plexippus (DpN1) produce proteins with glycosylation patterns similar to mammalian cell systems.

An alternative expression system in insect cells is the use of stably transformed cells. Cell lines such as the Schneider 2 (S2) and Kc cells (Drosophila melanogaster) and C7 cells (Aedes albopictus) can be used for expression. The Drosophila metallothionein promoter can be used to induce high levels of expression in the presence of heavy metal induction with cadmium or copper. Expression vectors are typically maintained by the use of selectable markers such as neomycin and hygromycin.

e. Mammalian Expression

Mammalian expression systems can be used to express cytochrome P450s and cytochrome P450 reductases, including cytochrome P450 santalene oxidase polypeptides, cytochrome P450 bergamotene oxidase polypeptides and cytochrome P450 reductase polypeptides provided herein and also can be used to produce terpenes whose reactions are catalyzed by the synthases. Expression constructs can be transferred to mammalian cells by viral infection such as adenovirus or by direct DNA transfer such as liposomes, calcium phosphate, DEAE-dextran and by physical means such as electroporation and microinjection. Expression vectors for mammalian cells typically include an mRNA cap site, a TATA box, a translational initiation sequence (Kozak consensus sequence) and polyadenylation elements. Such vectors often include transcriptional promoter-enhancers for high level expression, for example the SV40 promoter-enhancer, the human cytomegalovirus (CMV) promoter, and the long terminal repeat of Rous sarcoma virus (RSV). These promoter-enhancers are active in many cell types. Tissue and cell-type promoters and enhancer regions also can be used for expression. Exemplary promoter/enhancer regions include, but are not limited to, those from genes such as elastase I, insulin, immunoglobulin, mouse mammary tumor virus, albumin, alpha-fetoprotein, alpha 1-antitrypsin, beta-globin, myelin basic protein, myosin light chain-2 and gonadotropic releasing hormone gene control. Selectable markers can be used to select for and maintain cells with the expression construct. Examples of selectable marker genes include, but are not limited to, hygromycin B phosphotransferase, adenosine deaminase, xanthine-guanine phosphoribosyl transferase, aminoglycoside phosphotransferase, dihydrofolate reductase and thymidine kinase. Fusion with cell surface signaling molecules such as TCR-ζ and Fc_(ε)RI-γ can direct expression of the proteins in an active state on the cell surface.

Many cell lines are available for mammalian expression including mouse, rat, human, monkey, chicken and hamster cells. Exemplary cell lines include, but are not limited to, BHK (i.e. BHK-21 cells), 293-F, CHO, CHO Express (CHOX; Excellgene), Balb/3T3, HeLa, MT2, mouse NS0 (non-secreting) and other myeloma cell lines, hybridoma and heterohybridoma cell lines, lymphocytes, fibroblasts, Sp2/0, COS, NIH3T3, HEK293, 293S, 293T, 2B8, and HKB cells. Cell lines adapted to serum-free media also are available, which facilitates purification of secreted proteins from the cell culture media. One such example is the serum-free EBNA-1 cell line (Pham et al. (2003) Biotechnol. Bioeng. 84:332-42).

f. Exemplary Host Cells

Exemplary host cells for expression of a cytochrome P450 polypeptide provided herein, such as a cytochrome P450 santalene oxidase, cytochrome P450 bergamotene oxidase or cytochrome P450 reductase, include prokaryotic and eukaryotic cells. Typically, the host cell produces an acyclic pyrophosphate terpene precursor. For example, the host cell produces farnesyl diphosphate. In some examples, the host cell can be a cell line that produces FPP as part of the mevalonate-dependent isoprenoid biosynthetic pathway (e.g. fungi, including yeast cells, and animal cells) or the mevalonate-independent isoprenoid biosynthetic pathway (e.g. bacteria and higher plants). In some examples, the host cell produces farnesyl diphosphate natively. In other examples, the host cell is modified to produce more farnesyl diphosphate compared to an unmodified cell. Exemplary host cells include bacteria, yeast, insect, plant and mammalian cells. In particular examples, the host cell is a yeast cell. For example, the yeast cell is a Saccharomyces genus cell, such as a Saccharomyces cerevisiae cell. In another example, the yeast cell is a Pichia genus cell, such as a Pichia pastoris cell. In other particular examples, the host cell is an Escherichia coli cell.

In particular examples, the host cell has been modified to overproduce FPP. Exemplary of such cells are modified yeast cells. For example, yeast cells that have been modified to produce less squalene synthase or less active squalene synthase (e.g. erg9 mutants; see e.g. U.S. Pat. Nos. 6,531,303 and 6,689,593) are useful in the methods provided herein to produce santalols and bergamotols. Reduced squalene synthase activity results in accumulation of FPP in the host cell at higher levels compared to wild type yeast cells. Exemplary modified yeast cells include, but are not limited to, modified Saccharomyces cerevisiae strains YPH499 (MATa, ura3-52, lys2-801, ade2-101, trp1-Δ63, his3-Δ200, leu2-Δ1), WAT11 (MATa, ade2-1, his3-11,-15; leu2-3,-112, ura3-1, canR, cyr+; containing chromosomally integrated Arabidopsis NADPH-dependent P450 reductase ATR1; see Pompon et al. (1995) Toxicol Lett 82-83:815-822; Ro et al. (2005) Proc Natl Acad Sci USA 102:8060-8065); and BY4741 (MATa, his3Δ1, leu2Δ0, met15Δ0, ura3Δ0; ATCC #201388). The use of such host cells for expression of a cytochrome P450 polypeptide provided herein allows for increased yields of the precursor FPP and thus allows for increased yields of santalenes and bergamotenes.

Provided herein are host cells containing any cytochrome P450 polypeptide or catalytically active fragment thereof provided herein. In some examples, the host cell contains a cytochrome P450 polypeptide or catalytically active fragment thereof encoded by a sequence of nucleotides set forth in any of SEQ ID NOS:1-5 and 67-72. In other examples, the host cell contains a cytochrome P450 polypeptide or catalytically active fragment thereof encoded by a sequence of nucleic acids that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:1-5 and 67-72. In other examples, the host cell contains nucleic acid encoding a cytochrome P450 polypeptide or catalytically active fragment thereof that has a sequence of amino acids set forth in any of SEQ ID NOS:6-9, 50 and 73-78. In yet other examples, the host cell contains nucleic acid encoding a cytochrome P450 polypeptide or catalytically active fragment thereof that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of amino acids set forth in any of SEQ ID NOS:6-9, 50 and 73-78.

Provided herein are host cells containing a cytochrome P450 santalene oxidase or a catalytically active fragment thereof. In some examples, the host cell contains a cytochrome P450 santalene oxidase or catalytically active fragment thereof encoded by a sequence of nucleotides set forth in any of SEQ ID NOS:3, 68, 69, 70 or 71. In other examples, the host cell contains a cytochrome P450 santalene oxidase or catalytically active fragment thereof encoded by a sequence of nucleic acids that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:3, 68, 69, 70 or 71. In other examples, the host cell contains nucleic acid encoding a cytochrome P450 santalene oxidase or catalytically active fragment thereof that has a sequence of amino acids set forth in any of SEQ ID NOS:7, 74, 75, 76 or 77. In yet other examples, the host cell contains nucleic acid encoding a cytochrome P450 santalene oxidase or catalytically active fragment thereof that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of amino acids set forth in any of SEQ ID NOS:7, 74, 75, 76 or 77.

Provided herein are host cells containing a cytochrome P450 bergamotene oxidase or a catalytically active fragment thereof. In some examples, the host cell contains a cytochrome P450 bergamotene oxidase or catalytically active fragment thereof encoded by a sequence of nucleotides set forth in any of SEQ ID NOS:2, 4, 5 or 67. In other examples, the host cell contains a cytochrome P450 bergamotene oxidase or catalytically active fragment thereof encoded by a sequence of nucleic acids that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:2, 4, 5 or 67. In other examples, the host cell contains nucleic acid encoding a cytochrome P450 bergamotene oxidase or catalytically active fragment thereof that has a sequence of amino acids set forth in any of SEQ ID NOS:6, 8, 9 or 73. In yet other examples, the host cell contains nucleic acid encoding a cytochrome P450 bergamotene oxidase or catalytically active fragment thereof that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of amino acids set forth in any of SEQ ID NOS:6, 8, 9 or 73.
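As an illustration of how the percent sequence identity thresholds recited above might be checked in practice, the following is a minimal Python sketch. It assumes two sequences that have already been aligned to equal length (with '-' marking gaps) and counts identity over aligned columns, which is only one of several conventions in use. The function names percent_identity and meets_threshold are hypothetical and are not part of the disclosure, and the example strings are short made-up sequences, not sequences from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both inputs are pre-aligned, equal-length strings in which
    '-' marks a gap. Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # skip columns that are gaps in both sequences
        columns += 1
        if a == b:
            matches += 1  # identical residues (a gap never equals a residue)
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_threshold(identity: float, threshold: float) -> bool:
    """True if the identity is at least the recited threshold (e.g. 60 or 95)."""
    return identity >= threshold


if __name__ == "__main__":
    # Hypothetical toy alignment: 4 identical columns out of 5.
    ident = percent_identity("MKT-A", "MKTLA")
    print(f"{ident:.1f}% identity; meets 80% threshold: {meets_threshold(ident, 80)}")
```

In actual use the alignment itself would come from a standard alignment program, and the choice of denominator (alignment length, shorter sequence, or reference sequence length) should follow the definition of sequence identity given elsewhere in the specification.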

In some examples, any of the host cells provided herein containing a cytochrome P450 or catalytically active fragment thereof can further contain a terpene synthase. Provided herein are host cells containing a cytochrome P450 or catalytically active fragment thereof and a terpene synthase. In such examples, the terpene synthase can be a santalene synthase. For example, the terpene synthase is a santalene synthase having a sequence of amino acids set forth in any of SEQ ID NOS:17, 52 or 53, or a santalene synthase having at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of amino acids set forth in any of SEQ ID NOS:17, 52 or 53, or a nucleic acid molecule encoding a santalene synthase, where the nucleic acid molecule has a sequence of nucleotides set forth in any of SEQ ID NOS:16, 59 or 60 or has at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:16, 59 or 60.

Provided herein are host cells containing a cytochrome P450 or catalytically active fragment thereof and a santalene synthase having a sequence of amino acids set forth in any of SEQ ID NOS:17, 52 or 53, or a santalene synthase having at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of amino acids set forth in any of SEQ ID NOS:17, 52 or 53, or a nucleic acid molecule encoding a santalene synthase, where the nucleic acid molecule has a sequence of nucleotides set forth in any of SEQ ID NOS:16, 59 or 60 or has at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:16, 59 or 60. In such examples, the cytochrome P450 or catalytically active fragment thereof is a cytochrome P450 polypeptide or catalytically active fragment thereof encoded by a sequence of nucleotides set forth in any of SEQ ID NOS:1-5 and 67-72, or a cytochrome P450 polypeptide or catalytically active fragment thereof encoded by a sequence of nucleic acids that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:1-5 and 67-72, or a nucleic acid molecule encoding a cytochrome P450 polypeptide or catalytically active fragment thereof that has a sequence of amino acids set forth in any of SEQ ID NOS:6-9, 50 and 73-78, or a nucleic acid molecule encoding a cytochrome P450 polypeptide or catalytically active fragment thereof that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of amino acids set forth in any of SEQ ID NOS:6-9, 50 and 73-78.

In one example, provided herein is a host cell that contains a cytochrome P450 polypeptide or catalytically active fragment thereof and a santalene synthase. In another example, provided herein is a host cell that contains a cytochrome P450 santalene oxidase or catalytically active fragment thereof and a santalene synthase. In yet another example, provided herein is a host cell that contains a cytochrome P450 bergamotene oxidase or catalytically active fragment thereof and a santalene synthase. Also provided herein are host cells containing a cytochrome P450 or catalytically active fragment thereof and a terpene synthase that further contain a cytochrome P450 reductase or catalytically active fragment thereof. In such examples, the terpene synthase can be a santalene synthase. For example, the terpene synthase is a santalene synthase having a sequence of amino acids set forth in any of SEQ ID NOS:17, 52 or 53, or a santalene synthase having at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of amino acids set forth in any of SEQ ID NOS:17, 52 or 53, or a nucleic acid molecule encoding a santalene synthase, where the nucleic acid molecule has a sequence of nucleotides set forth in any of SEQ ID NOS:16, 59 or 60, or has at least 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:16, 59 or 60. In such examples, the cytochrome P450 reductase or catalytically active fragment thereof is a cytochrome P450 reductase or catalytically active fragment thereof encoded by a sequence of nucleotides set forth in any of SEQ ID NOS:10 or 11, or encoded by a sequence of nucleotides that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:10 or 11, or is encoded by a nucleic acid molecule encoding a cytochrome P450 reductase or catalytically active fragment thereof that has a sequence of amino acids set forth in any of SEQ ID NOS:12-15, or that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of amino acids set forth in any of SEQ ID NOS:12-15. In such examples, the cytochrome P450 or catalytically active fragment thereof is a cytochrome P450 polypeptide or catalytically active fragment thereof encoded by a sequence of nucleotides set forth in any of SEQ ID NOS:1-5 and 67-72, or encoded by a sequence of nucleotides that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:1-5 and 67-72, or is encoded by a nucleic acid molecule encoding a cytochrome P450 polypeptide or catalytically active fragment thereof that has a sequence of amino acids set forth in any of SEQ ID NOS:6-9, 50 and 73-78, or that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of amino acids set forth in any of SEQ ID NOS:6-9, 50 and 73-78.

In one example, provided herein is a host cell containing a cytochrome P450 polypeptide or catalytically active fragment thereof, a santalene synthase and a cytochrome P450 reductase or catalytically active fragment thereof. In another example, provided herein is a host cell containing a cytochrome P450 santalene oxidase or catalytically active fragment thereof, a santalene synthase and a cytochrome P450 reductase or catalytically active fragment thereof. In yet another example, provided herein is a host cell containing a cytochrome P450 bergamotene oxidase or catalytically active fragment thereof, a santalene synthase and a cytochrome P450 reductase or catalytically active fragment thereof.

Provided herein are host cells containing a cytochrome P450 reductase or a catalytically active fragment thereof. In some examples, the host cell contains a cytochrome P450 reductase or catalytically active fragment thereof encoded by a sequence of nucleotides set forth in any of SEQ ID NOS:10 or 11. In other examples, the host cell contains a cytochrome P450 reductase or catalytically active fragment thereof encoded by a sequence of nucleotides that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:10 or 11. In other examples, the host cell contains nucleic acid encoding a cytochrome P450 reductase or catalytically active fragment thereof that has a sequence of amino acids set forth in any of SEQ ID NOS:12-15. In yet other examples, the host cell contains nucleic acid encoding a cytochrome P450 reductase or catalytically active fragment thereof that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of amino acids set forth in any of SEQ ID NOS:12-15.
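The percent sequence identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by some alignment program and that identity is counted over aligned columns; both are assumptions made for illustration and may differ from the definition of sequence identity given elsewhere in this document. The function names and the toy peptides are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the aligned columns of two pre-aligned sequences.

    Assumption: the denominator is the number of alignment columns in which
    at least one sequence has a residue. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap character
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned positions to compare")
    # A residue paired with a gap never counts as a match
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold_percent: float) -> bool:
    """True if the pair reaches 'at least threshold_percent' identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Hypothetical toy peptides differing at one of ten positions
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQR"
identity = percent_identity(reference, candidate)
```

With these toy inputs, `meets_threshold(reference, candidate, 85)` holds while a 95% threshold does not, which mirrors how the stepped list of percentages narrows the set of qualifying sequences.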

In some examples, the host cell containing a cytochrome P450 reductase or catalytically active fragment thereof further contains a cytochrome P450 or catalytically active fragment thereof. For example, provided herein are host cells containing a cytochrome P450 reductase or a catalytically active fragment thereof and a cytochrome P450 or catalytically active fragment thereof. In such examples, the cytochrome P450 or catalytically active fragment thereof is a cytochrome P450 polypeptide or catalytically active fragment thereof encoded by a sequence of nucleotides set forth in any of SEQ ID NOS:1-5 and 67-72, or encoded by a sequence of nucleotides that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of nucleotides set forth in any of SEQ ID NOS:1-5 and 67-72, or is encoded by a nucleic acid molecule encoding a cytochrome P450 polypeptide or catalytically active fragment thereof that has a sequence of amino acids set forth in any of SEQ ID NOS:6-9, 50 and 73-78, or that has at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% sequence identity to a sequence of amino acids set forth in any of SEQ ID NOS:6-9, 50 and 73-78.

In one example, provided herein is a host cell containing a cytochrome P450 polypeptide or catalytically active fragment thereof and a cytochrome P450 reductase or catalytically active fragment thereof. In another example, provided herein is a host cell containing a cytochrome P450 santalene oxidase or catalytically active fragment thereof and a cytochrome P450 reductase or catalytically active fragment thereof. In yet another example, provided herein is a host cell containing a cytochrome P450 bergamotene oxidase or catalytically active fragment thereof and a cytochrome P450 reductase or catalytically active fragment thereof.

5. Purification

Methods for purification of cytochrome P450s and cytochrome P450 reductases, such as cytochrome P450 santalene oxidase polypeptides, cytochrome P450 bergamotene oxidase polypeptides and cytochrome P450 reductase polypeptides, from host cells depend on the chosen host cells and expression systems. For secreted molecules, proteins are generally purified from the culture media after removing the cells. For intracellular expression, cells can be lysed and the proteins purified from the extract. When transgenic organisms such as transgenic plants and animals are used for expression, tissues or organs can be used as starting material to make a lysed cell extract. Additionally, transgenic animal production can include the production of polypeptides in milk or eggs, which can be collected, and if necessary the proteins can be extracted and further purified using standard methods in the art.

Cytochrome P450s and cytochrome P450 reductases, including cytochrome P450 santalene oxidase polypeptides, cytochrome P450 bergamotene oxidase polypeptides and cytochrome P450 reductase polypeptides, can be purified using standard protein purification techniques known in the art including, but not limited to, SDS-PAGE, size fractionation and size exclusion chromatography, ammonium sulfate precipitation, chelate chromatography and ionic exchange chromatography. Expression constructs also can be engineered to add an affinity tag, such as a myc epitope, GST fusion or His₆, to a protein, which is then affinity purified with myc antibody, glutathione resin or Ni-resin, respectively. Purity can be assessed by any method known in the art, including gel electrophoresis and staining and spectrophotometric techniques.

6. Fusion Proteins

Fusion proteins containing a cytochrome P450 or cytochrome P450 reductase, including a cytochrome P450 santalene oxidase polypeptide, cytochrome P450 bergamotene oxidase polypeptide or cytochrome P450 reductase polypeptide, and one or more other polypeptides also are provided. Linkage of a cytochrome P450 or cytochrome P450 reductase polypeptide with another polypeptide can be effected directly or indirectly via a linker. In one example, linkage can be by chemical linkage, such as via heterobifunctional agents or thiol linkages or other such linkages. Fusion also can be effected by recombinant means. Fusion of a cytochrome P450 or cytochrome P450 reductase, such as a cytochrome P450 santalene oxidase polypeptide, cytochrome P450 bergamotene oxidase polypeptide or cytochrome P450 reductase polypeptide, to another polypeptide can be to the N- or C-terminus of the cytochrome P450 santalene oxidase polypeptide, cytochrome P450 bergamotene oxidase polypeptide or cytochrome P450 reductase polypeptide.

A fusion protein can be produced by standard recombinant techniques. For example, DNA fragments coding for the different polypeptide sequences can be ligated together in-frame in accordance with conventional techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that give rise to complementary overhangs between two consecutive gene fragments that can subsequently be annealed and reamplified to generate a chimeric gene sequence (see, e.g., Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley & Sons, 1992). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST polypeptide). A cytochrome P450 santalene oxidase polypeptide-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the cytochrome P450 santalene oxidase protein. A cytochrome P450 bergamotene oxidase polypeptide-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the cytochrome P450 bergamotene oxidase protein. In some examples, a cytochrome P450 polypeptide-encoding nucleic acid can be cloned into such an expression vector such that the cytochrome P450 is linked in-frame to a santalene synthase polypeptide-encoding nucleic acid. For example, a cytochrome P450 santalene oxidase or bergamotene oxidase polypeptide-encoding nucleic acid can be cloned into such an expression vector such that the cytochrome P450 santalene oxidase or bergamotene oxidase is linked in-frame to a santalene synthase polypeptide-encoding nucleic acid. The cytochrome P450 and santalene synthases can be linked directly, without a linker, or alternatively, linked indirectly in-frame with a linker.

G. METHODS FOR PRODUCING TERPENOIDS AND METHODS FOR DETECTING SUCH PRODUCTS AND THE ACTIVITY OF THE CYTOCHROME P450 AND CYTOCHROME P450 REDUCTASE POLYPEPTIDES

The cytochrome P450 polypeptides provided herein can be used to, and assessed for their ability to, produce terpenoids, including monoterpenoids, sesquiterpenoids and diterpenoids, from any suitable terpene substrate, including monoterpenes, sesquiterpenes and diterpenes. Typically, the cytochrome P450 santalene oxidases provided herein produce santalols from santalenes and the cytochrome P450 bergamotene oxidases provided herein produce bergamotols from bergamotenes. Any method known to one of skill in the art can be used to produce terpenoids catalyzed by the cytochrome P450 polypeptides provided herein. The ability of the cytochrome P450 polypeptides provided herein to catalyze the formation of terpenoids from terpene substrates can be assessed using these methods. Terpenoid products can be analyzed by GC-MS and identified based on matches of the MS fragmentation patterns with entries in the NIST and Wiley libraries (for example, as described in Example 6 below).

The cytochrome P450 reductase polypeptides provided herein can be used to, and assessed for their ability to, transfer two electrons from NADPH to any suitable electron acceptor, including cytochrome P450s, cytochrome c, heme oxygenases, cytochrome b₅ and squalene epoxidases.

Other activities and properties of the cytochrome P450 and cytochrome P450 reductase polypeptides, such as the cytochrome P450 santalene oxidases, cytochrome P450 bergamotene oxidases and cytochrome P450 reductases provided herein, also can be assessed using methods and assays well known in the art. In addition to assessing the activity of the cytochrome P450 and cytochrome P450 reductase polypeptides and their ability to catalyze the formation of terpenoids, the kinetics of the reaction, increased substrate specificity, altered substrate utilization and/or altered product distribution (as compared to another cytochrome P450 and cytochrome P450 reductase polypeptide) can be assessed using methods well known in the art. For example, the amount and type of terpenoids produced from santalenes or bergamotenes by the santalene oxidase and bergamotene oxidase polypeptides provided herein can be assessed by gas chromatography methods (e.g. GC-MS), such as those described in Example 6, and identified by comparing the MS fragmentation patterns with entries in the NIST and Wiley libraries (see Example 6). Products can also be identified by comparison with compounds of authentic sandalwood oil.

Provided below are methods for the production of santalols, including (Z)-α-santalol, (E)-α-santalol, (Z)-β-santalol, (E)-β-santalol, (Z)-epi-β-santalol and (E)-epi-β-santalol, and bergamotols, including (E)-α-trans-bergamotol and (Z)-α-trans-bergamotol, where production of the santalols and bergamotols is catalyzed by the cytochrome P450 and cytochrome P450 reductase polypeptides provided herein. Also provided herein are methods for assessing the activity of the cytochrome P450 and cytochrome P450 reductase polypeptides provided herein.

1. Synthesis of Santalols and Bergamotols

The cytochrome P450 santalene oxidase and cytochrome P450 bergamotene oxidase polypeptides provided herein can be used to catalyze the formation of santalols and bergamotols from the terpene substrates santalenes and bergamotenes. In some examples, the cytochrome P450 santalene oxidases are expressed in cells that produce or overexpress a santalene synthase and FPP, such that santalols are produced as described elsewhere herein. In other examples, the cytochrome P450 bergamotene oxidases are expressed in cells that produce or overexpress a santalene synthase, such that bergamotols are produced as described elsewhere herein. In other examples, the cytochrome P450 santalene oxidase and cytochrome P450 bergamotene oxidase polypeptides provided herein are expressed in and purified from any suitable host cells, such as any described in Section E. The purified cytochrome P450 santalene oxidase and cytochrome P450 bergamotene oxidase polypeptides are then combined in vitro with santalenes and bergamotenes to produce santalols and bergamotols.

a. Oxidation of Santalenes and Bergamotenes

In some examples, the cytochrome P450 santalene oxidase polypeptides provided herein are overexpressed and purified as described in Section E above. The cytochrome P450 santalene oxidase is then incubated with one or more terpene substrates, including α-santalene, β-santalene, epi-β-santalene and/or α-trans-bergamotene, and one or more of α-santalol, β-santalol, epi-β-santalol and α-trans-bergamotol, such as (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol, (Z)-α-trans-bergamotol and (E)-α-trans-bergamotol, are produced. Alternatively, the cytochrome P450 santalene oxidase polypeptides provided herein are expressed in host cells that also produce terpene substrates, including α-santalene, β-santalene, epi-β-santalene and/or α-trans-bergamotene, resulting in the production of one or more of α-santalol, β-santalol, epi-β-santalol and α-trans-bergamotol, such as (E)-α-santalol, (Z)-α-santalol, (E)-β-santalol, (Z)-β-santalol, (E)-epi-β-santalol, (Z)-epi-β-santalol, (Z)-α-trans-bergamotol and (E)-α-trans-bergamotol. Production of santalols and bergamotols is then assessed, and the amount of product quantified, using any method provided herein, such as gas chromatography-mass spectrometry (GC-MS), gas chromatography-flame ionization detection (GC-FID) and liquid chromatography-mass spectrometry (LC-MS). Products can be identified by comparing their MS fragmentation patterns with entries in the NIST and Wiley libraries, such as described in Example 6, or by comparison with known terpenoids in sandalwood oil.

In other examples, the cytochrome P450 bergamotene oxidase polypeptides provided herein are overexpressed and purified as described in Section E above. The cytochrome P450 bergamotene oxidase is then incubated with one or more terpene substrates, including α-santalene, β-santalene, epi-β-santalene and/or α-trans-bergamotene, and one or more of (E)-α-trans-bergamotol or (Z)-α-trans-bergamotol is produced. In some examples, small amounts of α-santalol, β-santalol and/or epi-β-santalol are also produced. Alternatively, the cytochrome P450 bergamotene oxidase polypeptides provided herein are expressed in host cells that also produce terpene substrates, including α-santalene, β-santalene, epi-β-santalene and/or α-trans-bergamotene, resulting in the production of (E)-α-trans-bergamotol or (Z)-α-trans-bergamotol. In some examples, small amounts of α-santalol, β-santalol and/or epi-β-santalol are also produced. Production of bergamotols is then assessed, and the amount of product quantified, using any method provided herein, such as gas chromatography-mass spectrometry (GC-MS), gas chromatography-flame ionization detection (GC-FID) and liquid chromatography-mass spectrometry (LC-MS). Products can be identified by comparing their MS fragmentation patterns with entries in the NIST and Wiley libraries, such as described in Example 6, or by comparison with known terpenoids in sandalwood oil.

b. Conversion of Acyclic Pyrophosphate Terpene Precursors

In some examples, terpenoids can be generated biosynthetically from acyclic pyrophosphate terpene precursors, such as geranyl pyrophosphate, farnesyl pyrophosphate and geranylgeranyl pyrophosphate, by expression of a cytochrome P450 monooxygenase in a host cell that produces the acyclic pyrophosphate terpene precursor and a terpene synthase. Suitable host cells are described in Section E above. In one example, santalols and bergamotols are generated biosynthetically by expression of a cytochrome P450 santalene oxidase in a host cell that produces FPP and santalene synthase (see Example 10). In another example, bergamotols are generated biosynthetically by expression of a cytochrome P450 bergamotene oxidase in a host cell that produces FPP and santalene synthase (see Example 10). Production of santalols and bergamotols is then assessed, and the amount of product quantified, using any method provided herein, such as gas chromatography-mass spectrometry (GC-MS), gas chromatography-flame ionization detection (GC-FID) and liquid chromatography-mass spectrometry (LC-MS). Products can be identified by comparing their MS fragmentation patterns with entries in the NIST and Wiley libraries, such as described in Example 6, or by comparison with known terpenoids in sandalwood oil.

In another example, terpenoids can be generated from acyclic pyrophosphate terpene precursors by 1) incubating an acyclic pyrophosphate terpene precursor with a terpene synthase and 2) incubating the reaction products with a cytochrome P450 monooxygenase. In some examples, the reaction products of the acyclic pyrophosphate terpene precursor with the terpene synthase are isolated. In other examples, the cytochrome P450 monooxygenase is added directly to the first reaction mixture without previous purification. The two steps can be performed simultaneously or sequentially. Terpenoids produced by the reaction can be identified and quantified using any method provided herein, such as gas chromatography-mass spectrometry (GC-MS), gas chromatography-flame ionization detection (GC-FID) and liquid chromatography-mass spectrometry (LC-MS). Products can be identified by comparing their MS fragmentation patterns with entries in the NIST and Wiley libraries, such as described in Example 6, or by comparison with known terpenoids in sandalwood oil.

2. Methods for Production

a. Exemplary Cells

Santalols and bergamotols can be produced by expressing a cytochrome P450 polypeptide and/or a cytochrome P450 reductase polypeptide provided herein in a cell line that produces FPP as part of the mevalonate-dependent isoprenoid biosynthetic pathway (e.g. fungi, including yeast cells, and animal cells) or the mevalonate-independent isoprenoid biosynthetic pathway (e.g. bacteria and higher plants). In particular examples, santalols are produced by expressing a cytochrome P450 santalene oxidase polypeptide provided herein and a santalene synthase polypeptide in a cell line that has been modified to overproduce FPP. In other examples, bergamotols are produced by expressing a cytochrome P450 bergamotene oxidase polypeptide provided herein and a santalene synthase polypeptide in a cell line that has been modified to overproduce FPP. Exemplary of such cells are modified yeast cells. For example, yeast cells that have been modified to produce less squalene synthase or less active squalene synthase (e.g. erg9 mutants; see e.g. U.S. Pat. Nos. 6,531,303 and 6,689,593) are useful in the methods provided herein to produce santalols and bergamotols. Reduced squalene synthase activity results in accumulation of FPP in the host cell at higher levels compared to wild type yeast cells, thus allowing for increased yields of santalenes and bergamotenes. Exemplary modified yeast cells include, but are not limited to, modified Saccharomyces cerevisiae strains YPH499 (MATa, ura3-52, lys2-801, ade2-101, trp1-Δ63, his3-Δ200, leu2-Δ1), WAT11 (MATa, ade2-1, his3-11,-15; leu2-3,-112, ura3-1, canR, cyr+; containing chromosomally integrated Arabidopsis NADPH-dependent P450 reductase ATR1; see Pompon et al. (1995) Toxicol Lett 82-83:815-822; Ro et al. (2005) Proc Natl Acad Sci USA 102:8060-8065); and BY4741 (MATa, his3Δ1, leu2Δ0, met15Δ0, ura3Δ0; ATCC #201388).

b. Culture of Cells

In exemplary methods, a cytochrome P450 provided herein is expressed in a host cell line that has been modified to overproduce farnesyl diphosphate and overexpress a santalene synthase, whereby upon expression of the cytochrome P450, farnesyl diphosphate is converted to santalols and bergamotols. In other exemplary methods, a cytochrome P450 provided herein and a santalene synthase are expressed in a host cell line that has been modified to overproduce farnesyl diphosphate, whereby upon expression of both proteins, farnesyl diphosphate is converted to santalols or bergamotols. The cytochrome P450 and santalene synthase can be expressed separately, or together as a fusion protein described elsewhere herein. The cytochrome P450 and santalene synthase can be expressed simultaneously or sequentially. The host cell is cultured using any suitable method well known in the art. In some examples, such as for high throughput screening of cells expressing various cytochrome P450s, the cells expressing the cytochrome P450 are cultured in individual wells of a 96-well plate. In other examples where the host cell is yeast, the cell expressing the cytochrome P450 polypeptides and santalene synthase and producing FPP is cultured using fermentation methods such as those described below.

A variety of fermentation methodologies can be used for the production of santalols and bergamotols from yeast cells expressing the cytochrome P450 polypeptides provided herein. For example, large scale production can be effected by either batch or continuous fermentation. A classical batch fermentation is a closed system where the composition of the medium is set at the beginning of the fermentation and not subject to artificial alterations during the fermentation. Thus, at the beginning of the fermentation the medium is inoculated with the desired microorganism or microorganisms and fermentation is permitted to occur without further addition of nutrients. Typically, the concentration of the carbon source in a batch fermentation is limited, and factors such as pH and oxygen concentration are controlled. In batch systems the metabolite and biomass compositions of the system change constantly up to the time the fermentation is stopped. Within batch cultures, cells typically progress through a static lag phase to a high growth log phase and finally to a stationary phase where growth rate is diminished or halted. If untreated, cells in the stationary phase will eventually die.

A variation on the standard batch system is the Fed-Batch system, which is similar to a typical batch system with the exception that nutrients are added as the fermentation progresses. Fed-Batch systems are useful when catabolite repression tends to inhibit the metabolism of the cells and where it is desirable to have limited amounts of substrate in the medium. Also, the ability to feed nutrients will often result in higher cell densities in Fed-Batch fermentation processes compared to Batch fermentation processes. Factors such as pH, dissolved oxygen, nutrient concentrations, and the partial pressure of waste gases such as CO₂ are generally measured and controlled in Fed-Batch fermentations.

Production of the santalols or bergamotols also can be accomplished with continuous fermentation. Continuous fermentation is an open system where a defined fermentation medium is added continuously to a bioreactor and an equal amount of conditioned medium is removed simultaneously for processing. This system generally maintains the cultures at a constant high density where cells are primarily in their log phase of growth. Continuous fermentation allows for modulation of any number of factors that affect cell growth or end product concentration. For example, one method will maintain a limiting nutrient such as the carbon source or nitrogen level at a fixed rate and allow all other parameters to moderate. In other systems a number of factors affecting growth can be altered continuously while the cell concentration, measured by the medium turbidity, is kept constant. Continuous systems aim to maintain steady state growth conditions and thus the cell loss due to the medium removal must be balanced against the cell growth rate in the fermentation. Methods of modulating nutrients and growth factors for continuous fermentation processes as well as techniques for maximizing the rate of product formation are well known in the art.
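The steady-state balance described above, in which cell loss through medium removal equals cell growth, can be illustrated with a short Python sketch. It assumes Monod growth kinetics on a single limiting nutrient and a constant biomass yield; the Monod model, the function names and all parameter values are assumptions chosen for illustration and are not taken from this document. At steady state the specific growth rate equals the dilution rate (medium flow rate divided by culture volume).

```python
def monod_growth_rate(s, mu_max, k_s):
    """Specific growth rate (1/h) at limiting-substrate concentration s (Monod model)."""
    return mu_max * s / (k_s + s)


def chemostat_steady_state(dilution_rate, mu_max, k_s, s_in, yield_coeff):
    """Steady state of a continuous culture under the assumed Monod kinetics.

    Returns (substrate, biomass) concentrations at which growth balances
    removal, or None if the cells wash out.
    """
    if dilution_rate >= mu_max:
        return None  # medium is removed faster than cells can ever grow
    # Solve mu(s) = dilution_rate for the residual substrate concentration
    s_star = k_s * dilution_rate / (mu_max - dilution_rate)
    if s_star >= s_in:
        return None  # feed cannot supply enough substrate; washout
    # Biomass formed from the substrate consumed
    x_star = yield_coeff * (s_in - s_star)
    return s_star, x_star


# Hypothetical parameters: rates in 1/h, concentrations in g/L
state = chemostat_steady_state(dilution_rate=0.1, mu_max=0.3, k_s=0.5,
                               s_in=20.0, yield_coeff=0.5)
```

Raising the dilution rate toward `mu_max` in this sketch drives the culture to washout, which is why continuous systems must balance medium removal against the growth rate.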

Following cell culture, the cell culture medium can be harvested to obtain the produced santalols and bergamotols.

c. Isolation and Assays for Detection and Identification

The santalols and bergamotols produced using the methods above with the cytochrome P450 polypeptides provided herein can be isolated and assessed by any method known in the art. In one example, the cell culture medium is extracted with an organic solvent to partition any terpenes or terpenoids produced into the organic layer. Production of santalols and/or bergamotols can be assessed and/or the santalols and/or bergamotols isolated from other products using any method known in the art, such as, for example, gas chromatography or column chromatography. For example, the organic layer can be analyzed by GC-MS.

The quantity of santalols and/or bergamotols produced can be determined by any known standard chromatographic technique useful for separating and analyzing organic compounds. For example, santalol and/or bergamotol production can be assayed by any known chromatographic technique useful for the detection and quantification of hydrocarbons, such as santalol and/or bergamotol and other terpenoids, including, but not limited to, gas chromatography mass spectrometry (GC-MS), gas chromatography using a flame ionization detector (GC-FID), capillary GC-MS, high performance liquid chromatography (HPLC) and column chromatography. Typically, these techniques are carried out in the presence of known internal standards which are used to quantify the amount of the terpenoid produced. For example, terpenoids, including sesquiterpenoids, such as santalol and/or bergamotol, can be identified by comparison of retention times and mass spectra to those of authentic standards in gas chromatography with mass spectrometry detection. Typical standards include, but are not limited to, santalols and/or bergamotols. In other examples, quantification can be achieved by gas chromatography with flame ionization detection based upon calibration curves with known amounts of authentic standards and normalization to the peak area of an internal standard. These chromatographic techniques allow for the identification of any terpene present in the organic layer, including, for example, other terpenoids produced by the cytochrome P450s.
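Quantification from a calibration curve with normalization to an internal standard can be sketched in Python as follows. The sketch assumes a linear detector response, so that the ratio of analyte peak area to internal-standard peak area is a straight-line function of the amount of authentic standard; the function names and all numerical values are hypothetical and are not data from this document.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(analyte_area, istd_area, slope, intercept):
    """Convert a peak area, normalized to the internal standard, into an amount."""
    ratio = analyte_area / istd_area
    return (ratio - intercept) / slope


# Hypothetical calibration: known amounts (µg) of an authentic standard and
# the measured ratios of analyte peak area to internal-standard peak area
cal_amounts = [1.0, 2.0, 5.0, 10.0]
cal_ratios = [0.5, 1.0, 2.5, 5.0]
slope, intercept = fit_line(cal_amounts, cal_ratios)

# Hypothetical sample run with the same amount of internal standard added
sample_amount = quantify(analyte_area=3000.0, istd_area=2000.0,
                         slope=slope, intercept=intercept)
```

Dividing by the internal-standard area before applying the curve compensates for run-to-run differences in injection volume and extraction recovery, which is the reason for including the internal standard.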

In some examples, kinetics of santalol and/or bergamotol production can be determined by synthase assays in which radioactive isoprenoid substrates, such as ³H FPP or ¹⁴C FPP, are used with varying concentrations of synthase. The products are extracted into an organic layer and radioactivity is measured using a liquid scintillation counter. Kinetic constants are determined from direct fits of the Michaelis-Menten equation to the data.
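A direct (nonlinear least-squares) fit of the Michaelis-Menten equation, v = Vmax·[S]/(Km + [S]), can be sketched in Python without external libraries. The sketch assumes initial rates measured over a range of substrate concentrations. For any trial Km the best Vmax has a closed form, so only Km is searched, here by golden-section search on a logarithmic scale. The function names, search bounds and data are hypothetical, and dedicated curve-fitting software would normally be used in practice.

```python
import math


def mm_rate(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def best_vmax(substrate, rates, km):
    """Least-squares Vmax for a fixed Km (the model is linear in Vmax)."""
    xs = [s / (km + s) for s in substrate]
    return sum(x * v for x, v in zip(xs, rates)) / sum(x * x for x in xs)


def sum_sq_error(substrate, rates, km):
    """Sum of squared residuals at a trial Km with its optimal Vmax."""
    vmax = best_vmax(substrate, rates, km)
    return sum((v - mm_rate(s, vmax, km)) ** 2 for s, v in zip(substrate, rates))


def fit_michaelis_menten(substrate, rates, km_lo=1e-3, km_hi=1e3, iterations=200):
    """Return (Vmax, Km) minimizing squared error; Km searched on a log scale."""
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(km_lo), math.log(km_hi)
    for _ in range(iterations):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if (sum_sq_error(substrate, rates, math.exp(c))
                < sum_sq_error(substrate, rates, math.exp(d))):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    km = math.exp((a + b) / 2)
    return best_vmax(substrate, rates, km), km


# Hypothetical noise-free data generated with Vmax = 10 and Km = 5
substrate_um = [1.0, 2.0, 5.0, 10.0, 20.0, 50.0]
rates = [mm_rate(s, 10.0, 5.0) for s in substrate_um]
vmax_fit, km_fit = fit_michaelis_menten(substrate_um, rates)
```

Fitting the untransformed equation, rather than a linearized form such as a double-reciprocal plot, avoids distorting the error weighting of the low-concentration points.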

3. Production of Sandalwood Oil

The cytochrome P450 santalene oxidase and cytochrome P450 bergamotene oxidase polypeptides provided herein can be used to produce sandalwood oil. For example, the cytochrome P450 santalene oxidases can be expressed in cells that produce or overexpress a santalene synthase, such that santalols and bergamotol, including α-santalol, β-santalol and epi-β-santalol, and Z-α-trans-bergamotol, are produced as described elsewhere herein. The terpenoid products can be compared to those found in authentic sandalwood oil from S. album by GC-MS analysis, for example, as described in Example 8.

4. Assays for Detecting Enzymatic Activity of Cytochrome P450 and Cytochrome P450 Reductase Polypeptides

a. Methods for Determining the Activity of Cytochrome P450 Polypeptides

One of skill in the art is familiar with methods and assays to detect the enzymatic activity of cytochrome P450 polypeptides. Cytochrome P450 polypeptides can be expressed in yeast or purified from microsomal membrane fractions. Cytochrome P450 monooxygenase activity can be determined in vitro by incubation of a cytochrome P450 polypeptide with various monoterpene, sesquiterpene and diterpene substrates, as described in Example 11. Reaction products, including ratios of the products, can be determined by any method known to one of skill in the art, including GC-MS, GC-FID, LC-MS, comparison to known standards, and proton and carbon nuclear magnetic resonance (NMR). Alternatively, activity can be determined in vivo by addition of terpene substrates to yeast cultures expressing the cytochrome P450s and identifying products as described above. Total P450 content in microsomes can be quantified by CO differential absorption spectroscopy (see Guengerich et al. (2009) Nat Protoc 4:1245-1251 and Example 8).
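The calculation behind CO difference spectroscopy can be sketched in Python. The sketch assumes the widely used convention of taking the absorbance difference between 450 nm and 490 nm in the reduced CO difference spectrum and an extinction coefficient of 91 mM⁻¹ cm⁻¹; neither value is stated in this document, and the function name and readings are hypothetical.

```python
# Assumed extinction coefficient for the 450 nm minus 490 nm absorbance
# difference of the reduced P450-CO complex (not stated in this document)
EPSILON_P450_CO_MM = 91.0  # mM^-1 cm^-1


def p450_concentration_um(a450, a490, path_cm=1.0, dilution_factor=1.0):
    """P450 concentration (µM) from a reduced CO difference spectrum.

    Applies the Beer-Lambert law to the difference A450 - A490 and corrects
    for any dilution of the microsomes before measurement.
    """
    delta_a = a450 - a490
    conc_mm = delta_a / (EPSILON_P450_CO_MM * path_cm)
    return conc_mm * 1000.0 * dilution_factor  # mM to µM


# Hypothetical readings from an undiluted sample in a 1 cm cuvette
p450_um = p450_concentration_um(a450=0.100, a490=0.009)
```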

Enzyme kinetics can be determined in vitro in the presence of NADPH and CPR. In such assays, CPR is included in limited amounts, e.g., 0.1 U for determination of enzyme activity and 5 milliunits for determination of relative activities and kinetic parameters. Assays can be performed over a range of substrate concentrations and product formation can be determined by GC-MS.

b. Methods for Determining the Activity of Cytochrome P450 Reductase Polypeptides

One of skill in the art is familiar with methods and assays to detect the enzymatic activity of cytochrome P450 reductase polypeptides. In one example, CPR activity can be determined using an assay that detects C4H (cinnamate 4-hydroxylase) activity, for example, as described in Ro et al. (2001) Plant Physiology 126:317-329. C4H is a heme-thiolate protein that catalyzes the formation of p-coumarate from cinnamic acid. This assay can be used in vivo by expression of the cytochrome P450 reductase in yeast cells in the presence of C4H (see also, Ro et al. (2002) Plant Physiology 130:1837-1851). C4H activity is determined by detection of p-coumaric acid formation by HPLC (Mizutani et al. (1993) Plant Cell Physiology 34:481-488).

In order to assess CPR activity in vitro, CPRs can be purified from yeast microsomal fractions, such as described in Pompon et al. ((1996) Methods Enzymol 272:51-64) and Example 8 below. Total P450 content in microsomes can be quantified by CO differential absorption spectroscopy (Omura and Sato (1964) J Biol Chem 239:2370-2378; Mizutani and Ohta (1998) Plant Physiology 116:357-367). FAD and FMN content can be determined as described in Faeder and Siegel (1973) Anal Biochem 53:332-336. CPR activity in vitro can be assessed by a variety of assays known to one of skill in the art. For example, activity can be determined using the C4H assay described above. In another example, activity is determined by measuring reduction of an artificial electron acceptor, such as cytochrome c or oxidized ferricyanide (Xia et al. (2011) J Biol Chem 286:16246-16260; Hamdane et al. (2009) J Biol Chem 284:11374-11384; Shen et al. (1989) J Biol Chem 264:7584-7589). Formation of reduced cytochrome c is measured using a spectrophotometer and calculating the rate of reduction from the change in A₅₅₀ using an extinction coefficient (ε=21 mM⁻¹ cm⁻¹) (Imai (1976) J Biochem 80:267-276). Another assay that can be used to detect CPR is the ethoxycoumarin O-de-ethylase activity reporter assay in P450 2B4 reconstituted systems (Louerat-Orieu et al. (1998) Eur J Biochem 258:1040-1049).
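The conversion from an A₅₅₀ slope to a cytochrome c reduction rate follows from the Beer-Lambert law. The sketch below is illustrative only: it uses the extinction coefficient given above (21 mM⁻¹ cm⁻¹) and assumes a 1 cm path length by default; the function names, assay volume and protein amount are hypothetical.

```python
EPSILON_CYT_C = 21.0  # mM^-1 cm^-1 at 550 nm, as given in the text

def cyt_c_reduction_rate(delta_a550_per_min, path_cm=1.0, epsilon=EPSILON_CYT_C):
    """Rate of cytochrome c reduction in mM/min (equal to umol/min per mL of assay)."""
    return delta_a550_per_min / (epsilon * path_cm)

def specific_activity(delta_a550_per_min, assay_volume_ml, protein_mg, path_cm=1.0):
    """Specific activity in umol cytochrome c reduced per min per mg protein."""
    rate_per_ml = cyt_c_reduction_rate(delta_a550_per_min, path_cm)
    return rate_per_ml * assay_volume_ml / protein_mg

# Hypothetical reading: A550 rises by 0.21 per minute in a 1 mL assay
# containing 0.02 mg of microsomal protein.
rate = cyt_c_reduction_rate(0.21)
activity = specific_activity(0.21, assay_volume_ml=1.0, protein_mg=0.02)
```

Here the slope of 0.21 per minute corresponds to 0.01 µmol of cytochrome c reduced per minute per mL, or 0.5 µmol per minute per mg of protein for the assumed protein amount.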

The subcellular membrane localization site of a cytochrome P450 reductase polypeptide, e.g., whether the CPR is located in the ER or the chloroplast, can be determined by expressing the CPR with GFP fused to its C-terminus in Arabidopsis under the control of the cauliflower mosaic virus 35S promoter (see, Ro et al. (2002) Plant Physiology 130:1837-1851). Independently transformed T1 and T2 seedlings are then screened for the presence of GFP by fluorescence microscopy and confocal microscopy (see Ro et al. (2002) Plant Physiology 130:1837-1851) or by immunoblot analysis of microsomal proteins of seedlings. The functionality of the CPR in the GFP-CPR fusions can be verified using the C4H assay.

G. EXAMPLES

The following examples are included for illustrative purposes only and are not intended to limit the scope of the invention.

Example 1 Cloning and Sequencing of Santalum Album cDNA

In this example, RNA was extracted from wood samples of Sandalwood (Santalum album) trees and cDNA was generated and sequenced.

A. Isolation and Extraction of S. album RNA

Several 25 mm holes were drilled into the lower stems of mature Santalum album trees growing on land managed by the Forest Products Commission of Western Australia. Wood samples from the heartwood-sapwood transition zone were collected and frozen immediately in liquid nitrogen. RNA was extracted from 10 g tissue using a protocol modified from Kolosova et al. (2004) BioTechniques 36:821-824. After precipitation with LiCl, RNA was stored at −80° C. until cDNA synthesis.

B. Generation of S. Album cDNA Library

S. album xylem total RNA (1.4 μg) was reverse transcribed with SuperScript III reverse transcriptase (Invitrogen) at 42° C. for 1 hour using the SMART-Creator kit with the pDNR-LIB vector (Clontech; SEQ ID NO:20). The ligation mixture was transformed by electroporation into 25 μL of phage resistant electrocompetent E. coli cells and Sanger sequenced at the Genome Sciences Centre, Vancouver, Canada.

C. 454 Pyrosequencing and Sanger Sequencing

Two cDNA libraries from Santalum album cores were prepared and sequenced with Sanger technologies generating 11,520 paired end sequences. One plate of 454 Titanium sequencing was done on both libraries and generated 902,111 reads. Assembly was effected using the 454 and Sanger sequences with Newbler assembler v2.6 (454 Life Sciences, Roche Diagnostics) with default parameters. This generated 31,461 contigs (isotigs).

Example 2 Identification of Nucleic Acid Encoding S. Album Cytochrome P450 Polypeptides

Cytochrome P450 encoding genes were identified by comparing the assembled sequences (from Example 1) against a set of known plant P450 encoding genes from the CYP76 family of P450 proteins using a BLASTx search (blast.ncbi.nlm.nih.gov; Altschul et al. (1990) J Mol Biol 215:403-410).

Table 4 below provides a summary of 7 isotigs identified in the BLASTx search (blast.ncbi.nlm.nih.gov; Altschul et al. (1990) J Mol Biol 215:403-410), including the isotig, lowest E-value, the gene ID of the match in the P450 database, the CYP450 family and the number of reads. The E-value (Expect Value) describes the number of matches expected to occur randomly with a given score. In general, the smaller the E-value, the more likely the match is significant.

TABLE 4 Summary of CYP450 transcripts

#   Query         Lowest E-value      Identity to gene ID of     CYP450     Number
                  with match in       the match in the P450      family     of reads
                  P450 database       database, CrCYP76B6
                                      (CAC80883)
1   isotig05182   8.34E−142           71%                        SaCYP76    910
2   isotig05183   2.68E−145           71%                        SaCYP76    763
3   isotig05184   1.61E−78            52%                        SaCYP76    470
4   isotig06871   1.23E−126           83%                        SaCYP76    110
5   isotig06872   9.19E−156           83%                        SaCYP76    118
6   isotig14788   1.53E−93            86%                        SaCYP76    11
7   isotig29133   1.49E−52            60%                        SaCYP76    1

Transcripts from this family were the most abundant in the EST database and cluster into four different groups. Group 1 is represented by 3 isotigs (numbers 1-3 in Table 4) with a total of 2,143 reads including 1,107 unique sequences generating a final assembled sequence of 1917 base pairs (bp) with an open reading frame (ORF) of 1530 bp. Group 2 is represented by 2 isotigs (numbers 4-5 in Table 4), had 228 reads with 140 unique reads generating an assembled sequence of 1776 bp and an ORF of 1530 bp. Group 3 (number 6 in Table 4) was represented by 11 reads generating a partial sequence of 1200 bp. Group 4 (number 7 in Table 4) is a singleton of 277 bp with several stop codons along the sequence.
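The grouping arithmetic can be checked directly from the Table 4 values. The sketch below is illustrative only; the variable and function names are hypothetical. It sums the reads per group and picks, for each group, the isotig with the lowest E-value, on the reasoning above that a smaller E-value indicates a more significant match.

```python
# Rows of Table 4: (number, isotig, lowest E-value, number of reads)
TABLE_4 = [
    (1, "isotig05182", 8.34e-142, 910),
    (2, "isotig05183", 2.68e-145, 763),
    (3, "isotig05184", 1.61e-78, 470),
    (4, "isotig06871", 1.23e-126, 110),
    (5, "isotig06872", 9.19e-156, 118),
    (6, "isotig14788", 1.53e-93, 11),
    (7, "isotig29133", 1.49e-52, 1),
]

# Group membership as described in the text (Table 4 row numbers).
GROUPS = {1: (1, 2, 3), 2: (4, 5), 3: (6,), 4: (7,)}

def summarize(table, groups):
    """Return total reads and the most significant isotig for each group."""
    summary = {}
    for group, numbers in groups.items():
        rows = [row for row in table if row[0] in numbers]
        best = min(rows, key=lambda row: row[2])  # smallest E-value
        summary[group] = {
            "reads": sum(row[3] for row in rows),
            "best_hit": best[1],
        }
    return summary

summary = summarize(TABLE_4, GROUPS)
```

The totals for Groups 1 and 2 (2,143 and 228 reads) agree with the figures stated in the text.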

Example 3 Isolation of Cytochrome P450 Encoding cDNA

Group 1 and Group 2 cDNA molecules (numbers 1-5 in Table 4 above) of the CYP76 family identified in Example 2 were selected for cDNA isolation.

A. Cloning of Members of the CYP76 Family

Full-length cDNA molecules were amplified by polymerase chain reaction (PCR) with Phusion Hot Start II DNA Polymerase (Thermo Scientific) from S. album cDNA (set forth in SEQ ID NO:1) prepared as described in Example 1 using gene specific primers designed according to the ORF of Group 1 and Group 2 (set forth in Table 5 below). PCR conditions were as follows:

98° C. for 3 min;

2 cycles of: 98° C. for 10 sec, Tm −2° C. for 20 sec, 72° C. for 30 sec;

30 cycles of: 98° C. for 10 sec, Tm for 20 sec, 72° C. for 30 sec;

Final extension at 72° C. for 7 min

with a Tm of 55° C. for Isogroup 1 and a Tm of 52° C. for Isogroup 2. The PCR products were gel purified and cloned into the pJET1.2 vector (Fermentas, SEQ ID NO:21) according to the manufacturer's instructions. E. coli α-Select chemically competent cells (Bioline) were used for cloning and plasmid propagation. All constructs were verified by DNA sequencing.
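The cycling conditions above can be written as a small data structure parameterized by Tm, which makes the two-stage annealing scheme (2 cycles at Tm −2° C., then 30 cycles at Tm) explicit. This is a sketch with hypothetical names; the total counts only programmed hold times and ignores instrument ramping.

```python
def pcr_program(tm_c):
    """Cycling program from the text: each stage has a cycle count and (temp C, seconds) steps."""
    return [
        {"cycles": 1, "steps": [(98, 180)]},                              # initial denaturation
        {"cycles": 2, "steps": [(98, 10), (tm_c - 2, 20), (72, 30)]},     # annealing at Tm - 2
        {"cycles": 30, "steps": [(98, 10), (tm_c, 20), (72, 30)]},        # annealing at Tm
        {"cycles": 1, "steps": [(72, 420)]},                              # final extension
    ]

def total_hold_seconds(program):
    """Sum of programmed hold times, excluding temperature ramps."""
    return sum(
        stage["cycles"] * sum(seconds for _, seconds in stage["steps"])
        for stage in program
    )

isogroup1 = pcr_program(55)  # Tm of 55 C for Isogroup 1
isogroup2 = pcr_program(52)  # Tm of 52 C for Isogroup 2
minutes = total_hold_seconds(isogroup1) / 60
```

Under these assumptions the programmed hold time is 42 minutes for either isogroup, since only the annealing temperature differs.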

TABLE 5 Primers for amplification of cytochrome P450 cDNA

Primer                Sequence                        SEQ ID NO
Isogroup 1 Forward    ATGGACTTCTTAAGTTTTATCCTGTTTG    22
Isogroup 1 Reverse    TTACCCCCGGATCGGGACAG            23
Isogroup 2 Forward    ATGGACTTCTTAAGTTGTATCCTG        24
Isogroup 2 Reverse    TTACCCCCGGATTGGGACAG            25

Amplification with primers for Isogroup 1 resulted in a single unique cDNA clone designated SaCYP76F38v1 (SaCYP76-G5). Amplification with primers from Isogroup 2 resulted in 3 different cDNA clones designated: SaCYP76F39v1 (SaCYP76-G10), SaCYP76F37v1 (SaCYP76-G11) and SaCYP76F38v2 (SaCYP76-G12). A second amplification with primers from Isogroup 2 resulted in 6 additional different cDNA clones, designated SaCYP76F37v2 (SaCYP76-G14), SaCYP76F39v2 (SaCYP76-G15), SaCYP76F40 (SaCYP76-G16), SaCYP76F41 (SaCYP76-G17), SaCYP76F42 (SaCYP76-G13) and SaCYP76F43 (SaCYP76-G18). The SEQ ID NOS of the sequences of the nucleic acids and the encoded amino acids are set forth in Table 6 below. The translated amino acid sequences encoded by the 10 isolated cDNA molecules share between 93% and 99% identity (see Table 7 below) and between 1.0 and 6.6% divergence. Pair distances were prepared with ClustalW (slow/accurate, Gonnet weight matrix) (ebi.ac.uk/clustalw; European Bioinformatics Institute).

TABLE 6 Cytochrome P450 Polypeptides

Cytochrome P450               Nucleic acid SEQ ID NO    Amino acid SEQ ID NO
SaCYP76F38v1 (SaCYP76-G5)     2                         6
SaCYP76F39v1 (SaCYP76-G10)    3                         7
SaCYP76F37v1 (SaCYP76-G11)    4                         8
SaCYP76F38v2 (SaCYP76-G12)    5                         9
SaCYP76F37v2 (SaCYP76-G14)    67                        73
SaCYP76F39v2 (SaCYP76-G15)    68                        74
SaCYP76F40 (SaCYP76-G16)      69                        75
SaCYP76F41 (SaCYP76-G17)      70                        76
SaCYP76F42 (SaCYP76-G13)      71                        77
SaCYP76F43 (SaCYP76-G18)      72                        78

TABLE 7 Percent amino acid identity for cytochrome P450s from the CYP76 family

SaCYP76         F38v1  F39v1  F37v1  F38v2  F37v2  F39v2  F40    F41    F42    F43
SaCYP76F38v1    100    94     97     99     98     93     94     96     95     96
SaCYP76F39v1           100    95     94     95     99     98     96     95     95
SaCYP76F37v1                  100    98     99     95     94     95     94     95
SaCYP76F38v2                         100    99     94     94     96     95     95
SaCYP76F37v2                                100    95     93     95     94     95
SaCYP76F39v2                                       100    98     95     94     95
SaCYP76F40                                                100    97     96     95
SaCYP76F41                                                       100    97     94
SaCYP76F42                                                              100    96
SaCYP76F43                                                                     100
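Percent identity values such as those in Table 7 are derived from pairwise alignments. The sketch below shows one common definition, identical residues divided by aligned (non-gap) positions; it is an assumption that this matches the ClustalW calculation used here, since tools differ in how gaps enter the denominator. The function name is hypothetical, and the two eight-residue strings are short illustrative inputs, not the full-length proteins.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gap columns are excluded from the denominator
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared

# Illustrative inputs only: two short peptides differing at one position.
identity = percent_identity("MDFLSFIL", "MDFLSCIL")
gapped = percent_identity("MDFLSF-IL", "MDFLSCAIL")
```

Both calls compare eight residue pairs with seven matches, giving 87.5% identity; the gapped column in the second call is skipped.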

Example 4 Sequence and Phylogenetic Analysis of SaCYP76 Proteins

A BLASTx search of the deduced amino acid sequences against the GenBank non-redundant protein database (blast.ncbi.nlm.nih.gov; Altschul et al. (1990) J Mol Biol 215:403-410) identified a putative cytochrome P450 from Vitis vinifera (GenBank Accession No. XP_002281735; SEQ ID NO:26) that has 62% to 64% sequence identity to the S. album CYPs and a CYP76B6 geraniol hydroxylase from Catharanthus roseus (GenBank Accession No. CAC80883; Collu et al. (2001) FEBS Lett 308:215-220; SEQ ID NO:27) that has 54% to 55% sequence identity to the S. album CYPs. Protein alignment of the full length protein sequences was made with ClustalW (ebi.ac.uk/clustalw; European Bioinformatics Institute).

Phylogenetic trees were constructed with MEGA version 4 (Centre for Evolutionary Medicine and Informatics; Tamura et al., 2007 Mol Biol Evol 24:1596-1599) employing the neighbor joining (NJ) method with default parameters. Bootstrap (500 replications) confidence values over 50% are displayed at branch points. The neighbor-joining phylogeny of the predicted protein sequences of the initial four S. album CYP clones SaCYP76F38v1 (SaCYP76-G5), SaCYP76F39v1 (SaCYP76-G10), SaCYP76F37v1 (SaCYP76-G11) and SaCYP76F38v2 (SaCYP76-G12) and cytochrome P450 enzymes for terpenoid metabolism in other species is set forth in FIG. 4. The SaCYP76 genes, which form a separate cluster in this phylogeny, are most closely related to the CYP76B cluster that includes geraniol/nerol hydroxylases from different species. Accession numbers of the amino acid sequences included in the phylogeny in FIG. 4, in addition to the S. album CYP76 P450 clones SaCYP76F38v1 (SaCYP76-G5), SaCYP76F39v1 (SaCYP76-G10), SaCYP76F37v1 (SaCYP76-G11) and SaCYP76F38v2 (SaCYP76-G12) provided herein, included: Helianthus tuberosus CYP76B1 (CAA71178; SEQ ID NO:28); Catharanthus roseus CYP76B6 (CAC80883; SEQ ID NO:27); Swertia mussotii CYP76B6 (ACZ48680; SEQ ID NO:29); Persea americana CYP71A1 (P24465; SEQ ID NO:30); Mentha×piperita CYP71A32 (Q947B7; SEQ ID NO:31); Artemisia annua CYP71AV1 (ABB82944; SEQ ID NO:32); Cichorium intybus CYP71AV8 (ADM86719; SEQ ID NO:33); Lactuca sativa CYP71BL1 (AEI59780; SEQ ID NO:34); Nicotiana tabacum CYP71D20 (Q94FM7; SEQ ID NO:35); Mentha×piperita CYP71D13 (Q9XHE7; SEQ ID NO:36); Mentha spicata CYP71D18 (Q6WKZ1; SEQ ID NO:37); Catharanthus roseus CYP72A1 (Q05047; SEQ ID NO:38); and Oryza sativa CYP76M7 (AK105913; SEQ ID NO:39).

A second neighbor joining phylogenetic tree was constructed with all 10 S. album CYP76F proteins and related terpene-modifying cytochrome P450 members of the CYP71 clan, using Picea sitchensis PsCYP720B4 (ADR78276; SEQ ID NO:79) as an outgroup. The phylogenetic tree is set forth in FIG. 10. The S. album CYP76F proteins fell into two separate clades and were closest to the CYP76B cluster of other species. Clade I santalene/bergamotene oxidases included SaCYP76F39v1 (SaCYP76-G10), SaCYP76F39v2 (SaCYP76-G15), SaCYP76F40 (SaCYP76-G16), SaCYP76F41 (SaCYP76-G17) and SaCYP76F42 (SaCYP76-G13). Clade II bergamotene oxidases included SaCYP76F37v1 (SaCYP76-G11), SaCYP76F37v2 (SaCYP76-G14), SaCYP76F38v1 (SaCYP76-G5) and SaCYP76F38v2 (SaCYP76-G12). Accession numbers of the amino acid sequences for other terpene-modifying CYPs included in the phylogenetic tree in FIG. 10, in addition to the S. album CYP76 P450 clones, include CaCYP76B4 Camptotheca acuminata putative geraniol-10-hydroxylase (AES93118; SEQ ID NO:80); CrCYP76B6 Catharanthus roseus geraniol 10-hydroxylase (Q8VWZ7; SEQ ID NO:81); SmCYP76B4 Swertia mussotii geraniol 10-hydroxylase (D1MI46; SEQ ID NO:82); OsCYP76M7 Oryza sativa ent-cassadiene C11α-hydroxylase (NP_001047185; SEQ ID NO:83); MpCYP71A32 Mentha×piperita menthofuran synthase (Q947B7; SEQ ID NO:84); PaCYP71A1 Persea americana (P24465; SEQ ID NO:85); CiCYP71AV8 Cichorium intybus valencene oxidase (ADM86719; SEQ ID NO:86); MpCYP71D13 Mentha×gracilis (−)-limonene-3-hydroxylase (AY281027; SEQ ID NO:87); NtCYP71D20 Nicotiana tabacum 5-epi-aristolochene-1,3-dihydroxylase (AF368376; SEQ ID NO:88); and GaCYP706B1 Gossypium arboreum (+)-delta-cadinene-8-hydroxylase (AAK60517; SEQ ID NO:89).

Example 5 Cytochrome P450 Reductase

Cytochrome P450 reductase encoding genes were identified by comparing the assembled sequences with a set of known plant cytochrome P450 reductases from Arabidopsis (CAB58575.1 (SEQ ID NO:58) and CAB58576.1 (SEQ ID NO:46)). Full length cDNA genes SaCPR1 and SaCPR2 were amplified by polymerase chain reaction (PCR) with Phusion Hot Start II DNA Polymerase (Thermo Scientific) from S. album cDNA prepared as described in Example 1 with gene specific primers designed according to the ORF of the cytochrome P450 reductase (set forth in Table 8).

TABLE 8 Primers for PCR of cytochrome P450 reductase genes

Primer            Sequence                               Tm    SEQ ID NO
SaCPR1 Forward    ATG AGT TCG AGC TCG GAG CTA TG         57    40
SaCPR1 Reverse    TCA CCA CAC ATC CCG TAA ATA CCT TC     57    41
SaCPR2 Forward    ATG CAA TTG AGC TCC GTC AAG            58    61
SaCPR2 Reverse    TCA CCA CAC ATC CCG TAA ATA CCT TCC    58    62

PCR conditions were as follows:

98° C. for 3 min;

2 cycles of: 98° C. for 10 sec, Tm −2° C. for 20 sec, 72° C. for 30 sec;

30 cycles of: 98° C. for 10 sec, Tm for 20 sec, 72° C. for 30 sec;

Final extension at 72° C. for 7 min

The PCR products were gel purified and cloned directly into the pET28b(+) vector (SEQ ID NO:51) or first cloned into pJET vector and then subcloned into expression vectors. E. coli α-Select chemically competent cells (Bioline) were used for cloning and plasmid propagation. All constructs were verified by DNA sequencing. PCR amplification resulted in two S. album cytochrome P450 reductase (CPR) clones designated CPR1 and CPR2, having nucleic acid sequences set forth in SEQ ID NOS:10 and 11, respectively, encoding the proteins set forth in SEQ ID NOS:12 and 13, respectively. The two CPR nucleic acid sequences share 70% sequence identity and the two CPR proteins share 82% sequence identity.

The web-based BlastX program (Altschul et al., (1990) J. Mol. Biol. 215:403-410) was then used to compare the identified sequences with sequences in the GenBank database. The CPR sequences share 79% sequence homology with the Vitis vinifera predicted cytochrome P450 reductase-like protein (Genbank Accession No. XP_002270732; SEQ ID NO:42), 78% sequence homology with the Gossypium hirsutum cytochrome P450 reductase (Genbank Accession No. ACN54324; SEQ ID NO:43) and 75% sequence homology with the Artemisia annua cytochrome P450 reductase (Genbank Accession No. ABI98819; SEQ ID NO:44).

Truncated CPRs were generated containing amino acids 44-692 of SEQ ID NO:12 (truncated protein sequence set forth in SEQ ID NO:14; nucleic acid sequence set forth in SEQ ID NO:63) and amino acids 61-704 of SEQ ID NO:13 (truncated protein sequence set forth in SEQ ID NO:15; nucleic acid sequence set forth in SEQ ID NO:64).
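Residue ranges such as 44-692 are 1-based and inclusive, whereas most programming languages slice with 0-based, end-exclusive indices. The sketch below shows the conversion; the function names are hypothetical and the placeholder string stands in for the real sequence, which is not reproduced here.

```python
def truncate(sequence, first_residue, last_residue):
    """Return residues first_residue..last_residue (1-based, inclusive)."""
    if not 1 <= first_residue <= last_residue <= len(sequence):
        raise ValueError("residue range lies outside the sequence")
    return sequence[first_residue - 1:last_residue]

def span_length(first_residue, last_residue):
    """Number of residues in a 1-based inclusive range."""
    return last_residue - first_residue + 1

# Placeholder of 704 residues standing in for a full-length protein.
placeholder = "A" * 704
truncated = truncate(placeholder, 61, 704)

cpr1_length = span_length(44, 692)   # range given for SEQ ID NO:12
cpr2_length = span_length(61, 704)   # range given for SEQ ID NO:13
```

By this arithmetic the stated ranges span 649 and 644 residues, respectively.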

Activity of recombinant SaCPR was assayed using the Cytochrome C Reductase (NADPH) assay kit (Sigma).

Example 6 Gas Chromatography-Mass Spectrometry Analysis

Gas chromatography-mass spectrometry (GC-MS) analysis was used to analyze the oxidation products of the S. album cytochrome P450s and S. album oil.

A. SGE Solgel-Wax Capillary Column

GC-MS analysis was performed on an Agilent 6890A/5973N GC-MS system containing an SGE Solgel-Wax capillary column (30 m×0.25 mm ID×0.25 μm thickness) in SIM-scan mode (scan: m/z 40-400; SIM: m/z 93, 94, 119, 136, 122, 202 and 204 [dwell time 50]). Sample volumes of 2 μL were injected in pulsed splitless mode at 250° C. with a column flow of 1 mL/min helium and 50 psi pulse pressure for 0.5 min with the following program: 40° C. for 2 min, ramp of 8° C. per min to 100° C., 15° C. per min to 250° C., hold 5 min.

Alternatively, the following program was also used to analyze the products of S. album SaCYP76F39v1 (SaCYP76-G10) and S. album oil: sample volumes of 2 μL were injected in pulsed splitless mode at 250° C. with a column flow of 0.8 mL/min helium and 10 psi pulse pressure for 0.05 min with the following program: 40° C. for 3 min, 10° C. per min to 100° C., 2° C. per min to 250° C., hold 10 min.

Product identification was based on best match of the MS fragmentation patterns with entries in the NIST and Wiley libraries (Wiley Registry® 9th Edition/NIST 2011; Fred W. McLafferty, John Wiley & Sons, Inc.) and by comparison with compounds of authentic S. album oil and Kovats index values.

B. HP5 and DB-Wax Fused Silica Column

GC-MS analysis was performed on an Agilent 7890A/5975C GC-MS system operating in electron ionization selected ion monitoring (SIM)-scan mode. Samples were analyzed on an HP5 (non-polar; 30 m×0.25 mm ID×0.25 μm thickness) and a DB-Wax fused silica column (polar; 30 m×0.25 mm ID×0.25 μm thickness). In both cases, the injector was operated in pulsed splitless mode with the injector temperature maintained at 250° C. Helium gas was used as the carrier gas with a flow rate of 0.8 mL/min and pulsed pressure set at 25 psi for 0.5 min. Scan range: m/z 40-500; SIM: m/z 93, 94, 105, 107, 119, 122 and 202 [dwell time 50 msec].

The oven program for the HP5 column was:

40° C. for 3 min, ramp of 10° C. per min to 130° C., 2° C. per min to 180° C., 50° C. per min to 300° C., hold 300° C. for 10 min.

The oven program for the DB-wax column was:

40° C. for 3 min, ramp of 10° C. per min to 130° C., 2° C. per min to 200° C., 50° C. per min to 250° C., hold 250° C. for 15 min.
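The run time implied by an oven program is the initial hold, plus the duration of each ramp (temperature change divided by ramp rate), plus the final hold. The sketch below, with hypothetical function and variable names, applies this to the HP5 and DB-Wax programs above.

```python
def run_time_min(start_temp_c, initial_hold_min, ramps, final_hold_min):
    """Total oven program time in minutes.

    ramps is a sequence of (rate in C per min, target temperature in C).
    """
    total = initial_hold_min
    temp = start_temp_c
    for rate_c_per_min, target_c in ramps:
        total += (target_c - temp) / rate_c_per_min  # minutes spent on this ramp
        temp = target_c
    return total + final_hold_min

# Oven programs as listed in the text.
hp5_minutes = run_time_min(40, 3, [(10, 130), (2, 180), (50, 300)], 10)
dbwax_minutes = run_time_min(40, 3, [(10, 130), (2, 200), (50, 250)], 15)
```

This gives about 49.4 minutes for the HP5 program and 63 minutes for the DB-Wax program, excluding oven cool-down and equilibration between injections.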

Chemstation software was used for data acquisition and processing. Compounds were identified by comparison of mass spectra with authentic samples and the NIST/EPA/NIH mass spectral library v2.0 and by comparison of retention indices with those appearing in Valder et al. (2003) J Essent Oil Res 15:178-186 and Sciarrone et al. (2011) J Chromatogr A 1218:5374.
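The text does not state how the retention indices were calculated. For temperature-programmed runs, one widely used definition is the linear retention index of van den Dool and Kratz, LRI = 100 × [n + (t_x − t_n)/(t_(n+1) − t_n)], where t_n and t_(n+1) are the retention times of the n-alkanes (carbon numbers n and n+1) that bracket the compound. The sketch below assumes that definition; the function name and the alkane retention times are hypothetical, not measured values.

```python
def linear_retention_index(t_x, alkane_times):
    """Linear retention index by interpolation between bracketing n-alkanes.

    alkane_times maps carbon number to retention time (min) on the same
    column and oven program as the analyte.
    """
    carbons = sorted(alkane_times)
    for n, n_next in zip(carbons, carbons[1:]):
        t_n, t_next = alkane_times[n], alkane_times[n_next]
        if t_n <= t_x <= t_next:
            fraction = (t_x - t_n) / (t_next - t_n)
            return 100.0 * (n + (n_next - n) * fraction)
    raise ValueError("retention time is outside the alkane ladder")

# Hypothetical alkane ladder (carbon number -> retention time in min).
ladder = {22: 36.0, 23: 39.0, 24: 42.0}
lri = linear_retention_index(37.5, ladder)
```

With these made-up times, a compound eluting midway between C22 and C23 is assigned an index of 2250.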

Example 7 Expression in Bacteria and Yeast

The S. album FPP synthase, santalene synthase, cytochrome P450 SaCYP76F38v1 (SaCYP76-G5) and cytochrome P450 reductase genes were cloned into pCDF-Duet (Novagen) and pACYC-Duet (Novagen) bacterial expression vectors. Genes encoding the full length S. album cytochrome CYP76F P450s, cytochrome P450 reductase, santalene synthase and farnesyl diphosphate synthase were cloned into various yeast expression vectors to allow expression in the Saccharomyces cerevisiae yeast strain BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0; ATCC #201388).

A. Bacterial Expression Vectors

Genes encoding FPP synthase (SEQ ID NO:18) and santalene synthase (SEQ ID NO:16), previously characterized from S. album (see, International PCT application No. WO2011000026 and Jones et al. (2011) J Biol Chem 286:17445-17454), were cloned into the bacterial expression vector pCDF-Duet (Novagen, SEQ ID NO:65) generating pCDF-Duet:SaFPPS:SaSSy. Genes encoding SaCPR (SEQ ID NO:11) and SaCYP76F38v1 (SaCYP76-G5) (SEQ ID NO:2) were cloned into the bacterial expression vector pACYC-Duet (Novagen, SEQ ID NO:45) generating pACYC-Duet:SaCPR:SaCYP76F38v1. These expression vectors are dual expression vectors that allow co-expression of two target genes via two multiple cloning sites.

pCDF-Duet:SaFPPS:SaSSy, which has a streptomycin selectable marker, was transformed into chemically competent C41 (DE3) E. coli cells (Avidis). These cells were grown up and rendered chemically competent again using calcium chloride, and transformed with pACYC-Duet:SaCPR:SaCYP76F38v1, which has a chloramphenicol selectable marker. Both antibiotics were used to select for colonies containing both duet vectors. These colonies were grown overnight in a rich medium (terrific broth) at 16° C. and protein expression was initiated through the addition of IPTG. Cytochrome P450 protein expression was supplemented with 5-amino-levulinic acid to aid in porphyrin synthesis, and evidenced by a reddening of the cell pellet.

B. Generation of Yeast Expression Vectors

1. S. Album Cytochrome P450s

The S. album CYP76F full length cDNAs identified in Table 6 above were sub-cloned into the yeast expression vector pYeDP60 (Cullin and Pompon (1988) Gene 65:203-217; Pompon et al. (1996) Methods Enzymol 272:51-64; Abecassis et al. (2003) Methods Mol Biol 231:165-173) following the uracil-excision (USER) cloning technique of Hamann and Moller (2007) Protein Expr Purif 56:121-127. The pYeDP60 vector contains a URA marker. The resulting constructs are set forth in Table 9 below.

2. S. Album Santalene Synthase and Farnesyl Diphosphate Synthase

Santalene synthase encoding cDNA (SaSSY, SEQ ID NO:16) and farnesyl diphosphate synthase encoding cDNA (SaFPPS, SEQ ID NO:18) were cloned into the NotI-BglII and BamHI-XhoI sites, respectively, of the galactose inducible expression vectors pESC-LEU (Stratagene, SEQ ID NO:47) or pESC-LEU2d (see, Ro et al. (2008) BMC Biotechnology 8:83) by In-Fusion Cloning (Clontech) following the manufacturer's instructions. Additional vectors were generated containing only the SaSSy gene (SEQ ID NO:16). The pESC-LEU and pESC-LEU2d vectors contain a LEU marker and the pESC-LEU2d vector is a high copy number vector containing a deletion in the Leu2 promoter. The resulting constructs are set forth in Table 9 below.

3. Cytochrome P450 Reductase

Cytochrome P450 reductase encoding cDNA (SaCPR, SEQ ID NO:11), identified in Example 5, was cloned into the EcoRI-NotI sites of pESC-HIS vector (Stratagene, SEQ ID NO:49) by In-Fusion Cloning (Clontech) following the manufacturer's instructions. The resulting constructs are summarized in Table 9 below.

TABLE 9 Yeast expression vectors

Construct ID            Marker    Description (MCS = multiple cloning site)
pESC-LEU:SaG1:SaG2      -LEU      MCS1 contains S. album Santalene Synthase (SaSSY)
                                  MCS2 contains S. album FPPS (SaFPPS)
pESC-LEU2d:SaG1:SaG2    -LEU      MCS1 contains S. album Santalene Synthase (SaSSY)
                                  MCS2 contains S. album FPPS (SaFPPS)
pESC-LEU:SaSSY          -LEU      MCS1 contains S. album Santalene Synthase (SaSSY)
pESC-LEU2d:SaSSY        -LEU      MCS1 contains S. album Santalene Synthase (SaSSY)
pESC-His:SaCPR          -HIS      MCS1 contains S. album cytochrome P450 reductase (SaCPR)
pYEDP60:F38v1           -URA      pYEDP60 contains S. album SaCYP76F38v1 (SaCYP76-G5)
pYEDP60:F39v1           -URA      pYEDP60 contains S. album SaCYP76F39v1 (SaCYP76-G10)
pYEDP60:F37v1           -URA      pYEDP60 contains S. album SaCYP76F37v1 (SaCYP76-G11)
pYEDP60:F38v2           -URA      pYEDP60 contains S. album SaCYP76F38v2 (SaCYP76-G12)
pYEDP60:F37v2           -URA      pYEDP60 contains S. album SaCYP76F37v2 (SaCYP76-G14)
pYEDP60:F39v2           -URA      pYEDP60 contains S. album SaCYP76F39v2 (SaCYP76-G15)
pYEDP60:F40             -URA      pYEDP60 contains S. album SaCYP76F40 (SaCYP76-G16)
pYEDP60:F41             -URA      pYEDP60 contains S. album SaCYP76F41 (SaCYP76-G17)
pYEDP60:F42             -URA      pYEDP60 contains S. album SaCYP76F42 (SaCYP76-G13)
pYEDP60:F43             -URA      pYEDP60 contains S. album SaCYP76F43 (SaCYP76-G18)

C. Yeast Transformation and Expression

All constructs were transformed into the Saccharomyces cerevisiae yeast strain BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0; ATCC #201388) using the LiCl method as described in Gietz et al. (1992) Nucleic Acids Res 20:1425. Transformed yeast were selected on plates with appropriate synthetic drop-out selection medium and grown at 30° C. for 48 hours.

1. Expression of Santalene Synthase

Production of santalenes and bergamotene was evaluated using constructs encoding the S. album santalene synthase. Yeast cells expressing the high copy number construct pESC-LEU2d:SaSSY produced about twice the amount of santalenes and bergamotene as determined by GC-MS (as described in Example 6A) compared to yeast cells expressing the pESC-LEU:SaSSY construct. No differences were observed between the cells expressing the santalene synthase in the presence or absence of farnesyl diphosphate synthase, indicating that FPP produced by yeast enzymes was accessible for S. album santalene synthase to produce santalenes and bergamotene. The high copy number construct pESC-LEU2d:SaSSY was used for further experiments.

2. Expression of Santalene Synthase and Cytochrome P450 Reductase

The pESC-LEU2d:SaSSY construct encoding santalene synthase and the pESC-His:SaCPR construct encoding S. album cytochrome P450 reductase (SaCPR) were co-transformed into the yeast strain BY4741. SaCPR was included to supply electrons from NADPH to the CYP450.

Example 8 Microsome Preparation

In order to purify the S. album cytochrome P450 enzymes for use in in vitro assays, microsomes were prepared. Microsomes contain fragmented endoplasmic reticulum (ER) which contains cytochrome P450. Thus, purification of microsomes results in concentrated and isolated cytochrome P450. CO spectra of recombinant P450s encoded by the S. album CYP76F P450s were measured according to Guengerich et al. (2009) Nat Protoc 4:1245-1251.

Microsome membranes were prepared from 250 mL yeast cultures according to Pompon et al. (1996) Methods Enzymol 272:51-64. In brief, a 5 mL overnight culture was used to inoculate 50 mL of SD-selective media starting at an OD600 of 0.2 and grown at 30° C., 170 rpm for 24 hours. A volume of 200 mL YPDE medium (1% yeast extract, 2% bacto-peptone, 5% ethanol, 2% dextrose) was inoculated with the 50 mL culture and incubated for another 24 hours at 30° C., 170 rpm. Cells were collected by centrifugation for 10 min at 1,000×g and induced with 2% galactose in 250 mL YP medium at 30° C., 170 rpm for 12-16 hours. For microsome isolation, yeast cells were pelleted by centrifugation at 2,000×g for 10 min, washed once with 5 mL TEK (50 mM Tris-HCl pH 7.5, 1 mM EDTA, 100 mM KCl) and resuspended in TES2 buffer (50 mM Tris-HCl pH 7.5, 1 mM EDTA, 600 mM Sorbitol, 5 mM DTT and 0.25 mM PMSF). All subsequent steps were performed at 4° C. Yeast cell walls were disrupted mechanically using acid-washed glass beads (425-600 μm, Sigma) and vigorous manual shaking for 3×30 sec. The cell homogenate was centrifuged at 10,000×g for 15 min followed by ultracentrifugation of the supernatant at 100,000×g for 1 hour to collect membranes. Microsomes were resuspended and homogenized in a buffer containing 50 mM Tris-HCl buffer pH 7.5, 1 mM EDTA and 30% (v/v) glycerol, and used directly for enzyme assays or stored at −80° C.

Microsome preparations for all ten S. album CYP76Fs except SaCYP76F43 (SaCYP76-G18) displayed characteristic P450 CO difference spectra (see FIG. 18). The P450 content of the microsomal preparations ranged from 0.2 to 1.6 μM. Microsome preparations were screened for P450 activity as described in Example 11 below.
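P450 concentrations of this kind are typically calculated from the CO difference spectrum using the absorbance difference between 450 nm and 490 nm. The sketch below assumes the commonly used extinction coefficient of 91 mM⁻¹ cm⁻¹ from Omura and Sato; that value is not stated in this text, and the function name and absorbance readings are hypothetical.

```python
EPSILON_P450_CO = 91.0  # mM^-1 cm^-1 for A450 - A490 (assumed, per Omura and Sato)

def p450_micromolar(a450, a490, path_cm=1.0, dilution_factor=1.0):
    """P450 concentration (uM) in the undiluted sample from a CO difference spectrum."""
    delta_a = a450 - a490
    millimolar = delta_a / (EPSILON_P450_CO * path_cm)
    return millimolar * 1000.0 * dilution_factor

# Hypothetical difference-spectrum readings for an undiluted preparation.
content = p450_micromolar(a450=0.0555, a490=0.0100)
```

With these made-up readings the difference of 0.0455 absorbance units corresponds to 0.5 μM P450, which falls within the 0.2 to 1.6 μM range reported above.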

Example 9 Generation and Isolation of Sesquiterpene Olefins

The sesquiterpene olefins α-santalene, β-santalene, epi-β-santalene and α-trans-bergamotene are not commercially available but can be produced by expression of S. album santalene synthase (SaSSY; SEQ ID NO:16) in yeast as described in Jones et al. (2011) J Biol Chem 286:17445-17454.

A sesquiterpene oil containing α-santalene, β-santalene, epi-β-santalene and α-trans-bergamotene was produced in an industrial scale fermentation. The mixture was separated using silver nitrate impregnated TLC plates according to Daramwar et al. (Analyst 137:4564-4570 (2012)). Fractions were scraped from the TLC plates and the sesquiterpenes were eluted with pentane followed by GC-MS analysis for purity. The extracted ion chromatograms are shown in FIGS. 19A-19D for the oil containing α-santalene, β-santalene, epi-β-santalene and α-trans-bergamotene (FIG. 19A), α-santalene (peak 1, FIG. 19B), α-trans-bergamotene (peak 2, FIG. 19C) and epi-β-santalene and β-santalene (peaks 3 and 4, FIG. 19D). The isolated sesquiterpenes were used in in vitro assays in Example 11 below.

Example 10 Functional Characterization of S. Album Cytochrome P450 Activity in S. Cerevisiae

The S. cerevisiae yeast host strain containing active santalene synthase and cytochrome P450 reductase described in Example 7.C.2. was used to express the S. album cytochrome CYP76F P450s identified in Example 2 above. Activity was assessed by measurement of in vivo formation of oxidation products as described in Section A below. Each S. album CYP76F in a pYeDP60 vector was transformed individually into the yeast host cell expressing santalene synthase and CPR. A control strain was generated that contained the empty pYeDP60 vector.

A. In Vivo P450 Assays in Yeast

For in vivo assays, yeast were grown overnight at 30° C. in 5 mL of 2% dextrose and minimal selective media. The next day, a 50 mL culture was initiated at a starting OD600 of 0.2 and grown at 30° C. with shaking at 170 rpm until the culture reached an OD600 of 0.6-0.8. Protein expression was initiated by transfer into minimal selective media with 2% galactose and grown for about 14-16 hours. Yeast cells were harvested by centrifugation at 1,000×g for 10 min and washed once with 5 mL sterile ddH₂O. Cells were extracted twice with 2 mL hexane:ethyl acetate (85:15) using about 250 μL acid-washed glass beads (425-600 μm, Sigma) and vortexing for 1 min. Pooled extracts were transferred to a clean test-tube containing anhydrous Na₂SO₄ and evaporated under a gentle stream of N₂ gas to about 200 μL. The samples were transferred to a GC glass vial for GC-MS analysis (as described in Example 6) or stored at −80° C.
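Starting a culture at a defined OD600 is a simple dilution calculation (C₁V₁ = C₂V₂). The sketch below is illustrative; the function name and the overnight OD600 reading are hypothetical, and it assumes OD600 scales linearly with cell density over the range measured.

```python
def inoculum_volume_ml(overnight_od600, target_od600, final_volume_ml):
    """Volume of overnight culture needed to reach target_od600 in final_volume_ml."""
    if overnight_od600 < target_od600:
        raise ValueError("overnight culture is more dilute than the target")
    return target_od600 * final_volume_ml / overnight_od600

# Hypothetical overnight reading of OD600 = 4.0; target is 50 mL at OD600 = 0.2.
volume = inoculum_volume_ml(overnight_od600=4.0, target_od600=0.2, final_volume_ml=50.0)
```

For the assumed reading, 2.5 mL of overnight culture would be brought to a final volume of 50 mL with fresh medium.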

B. Clade I Santalum Album P450s

Clade I S. album P450s SaCYP76F39v1 (SaCYP76-G10), SaCYP76F39v2 (SaCYP76-G15), SaCYP76F40 (SaCYP76-G16), SaCYP76F41 (SaCYP76-G17) and SaCYP76F42 (SaCYP76-G13) were assayed for their activity in vivo with GC-MS analysis as described in Example 6A or 6B.

1. SaCYP76F39v1 (SaCYP76-G10) with GC-MS Analysis as Described in Example 6A

Co-expression of santalene synthase and SaCYP76F39v1 (SaCYP76-G10) resulted in the detection of 11 product peaks identified as α-, β- and epi-β-santalol and α-trans-bergamotol (see FIGS. 8A-8B and Table 11 below). Nine (9) of the 11 products were also detected in the S. album oil, albeit in different ratios, as shown in FIGS. 8A and 8B. The products were identified based on matches of the MS fragmentation patterns with entries in the NIST and Wiley libraries (Wiley Registry® 9th Edition/NIST 2011; Fred W. McLafferty, John Wiley & Sons, Inc.) and by comparison with compounds of authentic S. album oil and Kovats index values (see FIG. 8A and Table 11). The main components of S. album oil are α-santalol, Z-α-trans-bergamotol, E-cis, epi-β-santalol and trans-β-santalol whereas the main products from SaCYP76F39v1 (SaCYP76-G10) are cis-α-santalol, α-santalol and trans-β-santalol. These differences can be due to different physiological conditions, such as pH, under which the SaSSy and SaP450 enzymes are active in the yeast cells and in the trees, or they can be due to changes in the ratios of products over time. The products monitored in yeast were formed and accumulated over a period of hours, while oil extracted from trees is potentially the product of years of accumulation. Farnesol (labeled #), which is produced by yeast independent of the expression of santalene synthase, and dodecanoic acid (labeled *), which is extracted from yeast, were also observed (see FIGS. 8B and 8C).

TABLE 11
Terpenoids identified in in vivo assay with SaCYP76F39v1 (SaCYP76-G10) and S. album oil

Peak  Retention  Retention  Products detected from                 Compounds detected
      Time       Index¹     SaCYP76F39v1                           in S. album oil
1     32.23      2169       unknown isomer of α-trans-bergamotol   traces
2     35.2       2214       unknown                                Yes
3     35.8       2228       unknown isomer of α-santalol           No
4     38.5       2294       cis-α-santalol                         Yes
5a    39.1       2308       unknown isomer of α-santalol           No
5b    39.1       2308       α-trans-bergamotol                     Yes
6     40.0       2331       unknown isomer of α-santalol           Yes
7     40.4       2341       unknown isomer of α-trans-bergamotol   Yes
8     41.1       2359       Epi-β-santalol                         Yes
9     41.7       2374       β-santalol                             Yes
10    42.7       2399       unknown isomer of β-santalol           Yes
11    43.2       2412       unknown isomer of β-santalol           Yes

*Dodecanoic acid, extracted from yeast; # Farnesol, product of yeast.
¹Linear retention indices (LRI) measured on a SGE Solgel-Wax column.

2. SaCYP76F39v1 (SaCYP76-G10) with GC-MS Analysis as Described in Example 6B

Co-expression of santalene synthase and SaCYP76F39v1 (SaCYP76-G10) resulted in the detection of eight products identified as (Z)- and (E)-α-santalol (peaks 5 and 7), (Z)- and (E)-α-trans-bergamotol (peaks 6 and 8), (Z)- and (E)-epi-β-santalol (peaks 9 and 11) and (Z)- and (E)-β-santalol (peaks 10 and 12) (see FIG. 11A). Table 12 below sets forth the peak number, compound and linear retention indices for the DBwax column and the HP5 column. Product identification was based on best match of the MS fragmentation patterns with entries in the NIST and Wiley libraries (Wiley Registry® 9^(th) Edition/NIST 2011; Fred W. McLafferty, John Wiley & Sons, Inc.) and by comparison with compounds of authentic S. album oil and Kovats index values. As shown in the figure, the product peak for (Z)-α-trans-bergamotol overlapped with a peak corresponding to (E,E)-farnesol, which was produced in yeast independent of SaCYP76F39v1 (SaCYP76-G10) (see FIG. 11B).

A fraction of the sesquiterpenols produced were modified to unidentified compounds (identified with hash tags (#) in FIG. 11A). When untransformed yeast cells were incubated with authentic sandalwood oil, the same unknown compounds were identified, implying that these unidentified compounds are not direct products of SaCYP76F39v1 (SaCYP76-G10) but are produced by an endogenous activity of yeast converting sandalwood sesquiterpenols (see FIGS. 12A-12B).

TABLE 12
Retention indices of sesquiterpenes and sesquiterpenols

Peak  Compound                         LRI¹ (DBwax)  LRI² (HP5)
1     α-santalene                      1579          1423
2     α-trans-bergamotene              1592          1437
3     epi-β-santalene                  1637          1450
4     β-santalene                      1652          1463
5     (Z)-α-santalol                   2343          1676
6     (Z)-α-trans-bergamotol           2353          1692
7     (E)-α-santalol                   2382          1697
8     (E)-α-trans-bergamotol           2389          1711
9     (Z)-epi-β-santalol               2409          1703
10    (Z)-β-santalol                   2423          1717
11    (E)-epi-β-santalol (tentative)   2452          1726
12    (E)-β-santalol                   2465          1738

¹Linear retention indices (LRI) measured on a DBwax column.
²Linear retention indices (LRI) measured on a HP5 column.
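For orientation, a linear retention index of the kind tabulated above is commonly obtained by interpolating an analyte's retention time between the two n-alkanes that bracket it in a temperature-programmed run (the van den Dool and Kratz relation). The sketch below is illustrative only: the alkane retention times and the analyte retention time are hypothetical values chosen so that the arithmetic is easy to follow, and the function name is not drawn from any procedure described herein.

```python
from bisect import bisect_right


def linear_retention_index(t_x: float, alkane_times: dict[int, float]) -> float:
    """Linear retention index from bracketing n-alkane retention times.

    alkane_times maps alkane carbon number -> retention time (same units as t_x).
    LRI = 100 * (n + (n_next - n) * (t_x - t_n) / (t_next - t_n))
    """
    carbons = sorted(alkane_times)
    times = [alkane_times[c] for c in carbons]
    if any(later <= earlier for earlier, later in zip(times, times[1:])):
        raise ValueError("alkane retention times must increase with carbon number")
    i = bisect_right(times, t_x) - 1  # index of the alkane eluting just before t_x
    if i < 0 or i >= len(carbons) - 1:
        raise ValueError("analyte elutes outside the alkane ladder")
    n, n_next = carbons[i], carbons[i + 1]
    t_n, t_next = times[i], times[i + 1]
    return 100.0 * (n + (n_next - n) * (t_x - t_n) / (t_next - t_n))


# Hypothetical alkane ladder (C22-C24) and hypothetical analyte retention time.
HYPOTHETICAL_ALKANES = {22: 27.0, 23: 30.0, 24: 33.0}
lri = linear_retention_index(31.29, HYPOTHETICAL_ALKANES)
print(round(lri))
```

Because the index is normalized to the alkane ladder, values measured on the same stationary phase can be compared across runs even when absolute retention times shift.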

3. SaCYP76F39v2 (SaCYP76-G15), SaCYP76F40 (SaCYP76-G16), SaCYP76F41 (SaCYP76-G17) and SaCYP76F42 (SaCYP76-G13)

SaCYP76F39v2 (SaCYP76-G15), SaCYP76F40 (SaCYP76-G16), SaCYP76F41 (SaCYP76-G17) and SaCYP76F42 (SaCYP76-G13) were assayed for their ability to oxidize sesquiterpenes using the in vivo assay described above with GC-MS analysis described in Example 6B. Co-expression of santalene synthase and SaCYP76F39v2 (SaCYP76-G15), SaCYP76F40 (SaCYP76-G16), SaCYP76F41 (SaCYP76-G17) or SaCYP76F42 (SaCYP76-G13) gave product profiles with nearly identical ratios to those observed for SaCYP76F39v1 (SaCYP76-G10) (see Table 12 and FIGS. 13A-13D).

C. Clade II Santalum Album P450s

Clade II S. album P450s SaCYP76F37v1 (SaCYP76-G11), SaCYP76F38v2 (SaCYP76-G12), SaCYP76F37v2 (SaCYP76-G14) and SaCYP76F38v1 (SaCYP76-G5) were assayed for their activity in vivo with GC-MS analysis as described in Example 6A or 6B.

1. SaCYP76F38v1 (SaCYP76-G5), SaCYP76F37v1 (SaCYP76-G11) and SaCYP76F38v2 (SaCYP76-G12) with GC-MS Analysis as Described in Example 6A

Co-expression of santalene synthase with SaCYP76F38v1 (SaCYP76-G5), SaCYP76F37v1 (SaCYP76-G11) or SaCYP76F38v2 (SaCYP76-G12) in the recombinant yeast system resulted in virtually identical products (see FIGS. 6A, 6B and 6C and Table 13 below). The products were identified based on matches of the MS fragmentation patterns with entries in the NIST and Wiley libraries (Wiley Registry® 9^(th) Edition/NIST 2011; Fred W. McLafferty, John Wiley & Sons, Inc.) and by comparison with compounds of authentic S. album oil (see FIG. 7 and Table 13). Peaks 1 and 7, which were observed for SaCYP76F38v1 (SaCYP76-G5), SaCYP76F37v1 (SaCYP76-G11) and SaCYP76F38v2 (SaCYP76-G12), correspond to α-trans-bergamotol, possibly representing different isomers. Peaks 1 and 7 were also observed for S. album oil (see FIG. 7 and Table 13). A third peak (labeled #) with a retention time of approximately 18 minutes was identified as farnesol, which is produced by yeast independent of the expression of santalene synthase and SaCYP76, as shown by its presence in the control cells containing an empty vector (FIG. 6D).

TABLE 13
Terpenoids identified in in vivo assay with SaCYP76-F38v1, -F37v1, -F38v2 and S. album oil

Peak  Retention  Products detected from                 Compounds detected
      Time       CYP76-F38v1, -F37v1, -F38v2            in S. album oil
1     17.64      unknown isomer of α-trans-bergamotol   traces
4     18.00      cis-α-santalol                         Yes
5b    18.05      α-trans-bergamotol                     Yes
7     18.15      unknown isomer of α-trans-bergamotol   Yes
8     18.40      Epi-β-santalol                         Yes
9     18.50      β-santalol                             Yes

# Farnesol, product of yeast.
¹Linear retention indices (LRI) measured on a SGE Solgel-Wax column.

2. SaCYP76F38v1 (SaCYP76-G5), SaCYP76F37v1 (SaCYP76-G11), SaCYP76F38v2 (SaCYP76-G12) and SaCYP76F37v2 (SaCYP76-G14)

SaCYP76F38v1 (SaCYP76-G5), SaCYP76F37v1 (SaCYP76-G11), SaCYP76F38v2 (SaCYP76-G12) and SaCYP76F37v2 (SaCYP76-G14) were assayed for their ability to oxidize sesquiterpenes using the in vivo assay described above with GC-MS analysis described in Example 6B. Co-expression of santalene synthase with SaCYP76F38v1 (SaCYP76-G5), SaCYP76F37v1 (SaCYP76-G11), SaCYP76F38v2 (SaCYP76-G12) or SaCYP76F37v2 (SaCYP76-G14) in the recombinant yeast system resulted in mostly (E)-α-trans-bergamotol (peak 8 in Table 12) with only traces of (E)-α-santalol and (E)-β-santalol (peaks 7 and 12 in Table 12) (see Table 12 and FIGS. 14A-14D).

D. SaCYP76F43 (SaCYP76-G18)

SaCYP76F43 (SaCYP76-G18) was assayed for its ability to oxidize sesquiterpenes using the in vivo assay described above with GC-MS analysis described in Example 6B. No activity was observed after co-expression of santalene synthase with SaCYP76F43 (SaCYP76-G18) (see FIG. 14E).

E. SaCPR1 and SaCPR2

To test whether SaCPR1 and SaCPR2, which are 70% identical at the protein level, could effect changes in the product profiles, both CPRs were tested as indicated in Example 6B with representative clade I and clade II SaCYP76Fs, SaCYP76F39v1 and SaCYP76F38v1, respectively. No differences were observed in the products and relative abundances as compared to those described in Sections B.2. and C.2. above.

Example 11 In Vitro Enzymatic Assays

Yeast microsomes containing a S. album cytochrome P450 and a cytochrome P450 reductase, generated in Example 8, were assayed for their ability to oxidize santalenes and bergamotene using either A) a coupled enzyme assay with the in vitro reaction products of SaSSy and FPP; B) an isolated mixture of santalenes and bergamotene as the substrate; or C) individual santalenes or bergamotene as the substrate.

A. Oxidation of Santalenes and Bergamotene Using a Coupled Enzyme Assay

Coupled enzyme assays with S. album santalene synthase (SaSSy) expressed in bacteria (Jones et al. (2011) J Biol Chem 286:17445-17454) were initiated with 50 μg of His₆-tag purified SaSSy and 70 μM farnesyl pyrophosphate (FPP) in TPS buffer (25 mM HEPES pH 7.5, 5 mM MgCl₂, 1 mM DTT) in a volume of 450 μL. The assays were incubated for 30 min at 30° C., followed by the addition of 50 μL of the microsome preparation containing a S. album cytochrome P450 and a cytochrome P450 reductase and 0.8 mM NADPH. The reaction was incubated for an additional 1 hour at 30° C. and was stopped by extraction with 500 μL hexane/ethyl acetate (85:15). The organic layer was concentrated under a gentle stream of N₂ gas to about 100 μL and analyzed by GC-MS analysis (as described in Example 6A above) or was stored at −80° C.

1. SaCYP76F38v1 (SaCYP76-G5)

The coupled enzyme assay was performed in vitro with SaCYP76F38v1 (SaCYP76-G5) and compared to the in vivo results to verify the utility of the assay. GC-MS analysis of the reaction products from the coupled assay showed the same two peaks identified in the in vivo assay in Example 8. In both assays, SaCYP76F38v1 (SaCYP76-G5) catalyzed the hydroxylation of bergamotene into Z-α-trans-bergamotol but did not catalyze the oxidation of any santalenes.

B. Oxidation of a Mixture of Santalenes and Bergamotene

S. album P450s were assayed for their sesquiterpene oxidase activities using a mixture of santalenes and bergamotene as the substrate.

1. Assays

Two different in vitro assays were used to screen the S. album CYP76Fs for sesquiterpene oxidase activity.

a. In Vitro Assay 1

Assays were performed in 400 μL reaction volumes containing 150 μL of 100 mM potassium phosphate buffer (pH 7.5), 20 μL of 20 mM NADPH, 1 μL of a 25 mM santalene/bergamotene mixture (containing α-santalene, epi-β-santalene, β-santalene and α-bergamotene) and 80 pmol of the microsome preparation (prepared as described in Example 8). The reactions were incubated at 30° C. for 1 hour and stopped by adding 500 μL hexane:ethyl acetate (85:15) followed by vortexing for 30 seconds. The organic layer was concentrated under a gentle stream of N₂ gas to about 100 μL and analyzed by GC-MS analysis (as described in Example 6A above) or was stored at −80° C.
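The final concentrations implied by the component volumes in In Vitro Assay 1 can be worked out by simple dilution. The sketch below is illustrative and rests on two assumptions that are not stated in the text: that the 25 mM figure refers to the total concentration of the sesquiterpene mixture, and that the 80 pmol refers to the amount of P450 in the 400 μL reaction.

```python
TOTAL_VOLUME_UL = 400.0  # reaction volume from the protocol


def final_concentration(stock_conc: float, stock_volume_ul: float,
                        total_volume_ul: float = TOTAL_VOLUME_UL) -> float:
    """Final concentration after dilution, in the same units as stock_conc."""
    return stock_conc * stock_volume_ul / total_volume_ul


buffer_mm = final_concentration(100.0, 150.0)           # potassium phosphate, mM
nadph_mm = final_concentration(20.0, 20.0)              # NADPH, mM
substrate_um = final_concentration(25.0, 1.0) * 1000.0  # sesquiterpene mixture, uM
p450_um = 80.0 / TOTAL_VOLUME_UL                        # pmol per uL equals uM (assumed P450 amount)

print(f"buffer {buffer_mm} mM, NADPH {nadph_mm} mM, "
      f"substrate {substrate_um} uM, P450 {p450_um} uM")
```

Under these assumptions the reaction contains about 37.5 mM buffer, 1 mM NADPH and 62.5 μM total sesquiterpene substrate.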

b. In Vitro Assay 2

Assays were performed in 400 μL reaction volumes containing 50 mM potassium phosphate pH 7.5, 0.8 mM NADPH and 40 μM of substrate. Enzyme reactions were initiated by adding 50 μL of the microsome preparation (prepared in Example 8), incubated at 30° C. for 2 hours with shaking and stopped by adding 500 μL hexane. The organic layer was transferred to a new GC vial, concentrated under a gentle stream of N₂ gas to about 100 μL and analyzed by GC-MS analysis (as described in Example 6B above).

2. Clade I Santalum Album P450s

Microsomes containing clade I S. album P450s SaCYP76F39v1 (SaCYP76-G10), SaCYP76F39v2 (SaCYP76-G15), SaCYP76F40 (SaCYP76-G16), SaCYP76F41 (SaCYP76-G17) and SaCYP76F42 (SaCYP76-G13) were assayed for their sesquiterpene oxidase activity using the assays set forth above with a mixture of santalenes and bergamotene as the substrate.

a. SaCYP76F39v1 (SaCYP76-G10)

The in vitro sesquiterpene oxidase activity of clade I S. album P450 SaCYP76F39v1 (SaCYP76-G10) was assessed using both assays described above.

i. Initial Experiment Using In Vitro Assay 1

Microsomes containing SaCYP76F39v1 (SaCYP76-G10) were assayed for their activity using the assay described in Section B.1.a. above. GC-MS analysis revealed eight different product peaks that were identified as santalols (see FIG. 9B; peaks correspond to those in Table 11 above). Product identification was based on best match of the MS fragmentation patterns with entries in the NIST and Wiley libraries (Wiley Registry® 9^(th) Edition/NIST 2011; Fred W. McLafferty, John Wiley & Sons, Inc.) and by comparison with compounds of authentic S. album oil (FIG. 9A) and Kovats index values.

ii. Assay Using In Vitro Assay 2

Microsomes containing SaCYP76F39v1 (SaCYP76-G10) were assayed for their activity using the assay described in Section B.1.b. above. GC-MS analysis of the reaction products revealed that SaCYP76F39v1 (SaCYP76-G10) catalyzed the hydroxylation of α-santalene, β-santalene, epi-β-santalene and α-trans-bergamotene, leading to 8 different compounds identified as (Z)- and (E)-α-santalol, (Z)- and (E)-β-santalol, (Z)- and (E)-epi-β-santalol and (Z)- and (E)-α-trans-bergamotol (see FIG. 15A and Table 12). The product profile was compared to an authentic sandalwood oil sample (see Table 12 and FIG. 15B), which showed identical retention times and mass spectra for all 8 compounds but in different ratios. SaCYP76F39v1 (SaCYP76-G10) produced (E)-α-santalol and (Z)-α-santalol in a ratio of approximately 5:1, and (E)-β-santalol and (Z)-β-santalol in a ratio of approximately 4:1. The main products formed with SaCYP76F39v1 (SaCYP76-G10) were (E)-α-santalol and (E)-β-santalol, while the main compounds of sandalwood oil are (Z)-α-santalol and (Z)-β-santalol. No product was formed in the absence of NADPH or with microsomes from yeast carrying an empty vector (see FIG. 15C).

b. SaCYP76F39v2 (SaCYP76-G15), SaCYP76F40 (SaCYP76-G16), SaCYP76F41 (SaCYP76-G17) and SaCYP76F42 (SaCYP76-G13)

Microsomes containing SaCYP76F39v2 (SaCYP76-G15), SaCYP76F40 (SaCYP76-G16), SaCYP76F41 (SaCYP76-G17) and SaCYP76F42 (SaCYP76-G13) were assayed for their activity using the assay described in Section B. above. GC-MS analysis of the reaction products revealed that SaCYP76F39v2 (SaCYP76-G15), SaCYP76F40 (SaCYP76-G16), SaCYP76F41 (SaCYP76-G17) and SaCYP76F42 (SaCYP76-G13) gave product profiles similar to those observed for SaCYP76F39v1 (SaCYP76-G10) (see Table 12 and FIGS. 16A-16D). The major products observed for SaCYP76F40 (SaCYP76-G16) and SaCYP76F42 (SaCYP76-G13) were (E)-α-trans-bergamotol (or (E)-α-exo-bergamotol) and (E)-β-santalol.

3. Clade II S. Album P450s

Microsomes containing clade II S. album P450s SaCYP76F37v1 (SaCYP76-G11), SaCYP76F38v2 (SaCYP76-G12), SaCYP76F37v2 (SaCYP76-G14) and SaCYP76F38v1 (SaCYP76-G5) were assayed for their sesquiterpene oxidase activity using the assays set forth above with a mixture of santalenes and bergamotene as the substrate.

a. SaCYP76F37v1 (SaCYP76-G11) and SaCYP76F38v2 (SaCYP76-G12)

Microsomes containing SaCYP76F37v1 (SaCYP76-G11) and SaCYP76F38v2 (SaCYP76-G12) were assayed for their activity using the assay described in Section B.1.a. above. GC-MS analysis of the reaction products revealed one product peak that was absent in the control reaction (microsomes containing only vector control). The product peak was identified as Z-α-trans-bergamotol based on best match of its MS fragmentation pattern with entries in the NIST and Wiley libraries (Wiley Registry® 9^(th) Edition/NIST 2011; Fred W. McLafferty, John Wiley & Sons, Inc.) and by comparison with compounds of authentic S. album oil.

b. SaCYP76F37v1 (SaCYP76-G11), SaCYP76F38v2 (SaCYP76-G12), SaCYP76F37v2 (SaCYP76-G14) and SaCYP76F38v1 (SaCYP76-G5)

Microsomes containing SaCYP76F37v1 (SaCYP76-G11), SaCYP76F38v2 (SaCYP76-G12), SaCYP76F37v2 (SaCYP76-G14) and SaCYP76F38v1 (SaCYP76-G5) were assayed for their activity using the assay described in Section B.1.b. above. GC-MS analysis of the reaction products revealed that SaCYP76F37v1 (SaCYP76-G11), SaCYP76F38v2 (SaCYP76-G12), SaCYP76F37v2 (SaCYP76-G14) and SaCYP76F38v1 (SaCYP76-G5) produced three compounds, which were identified as (E)-α-trans-bergamotol (or (E)-α-exo-bergamotol) as the major product, and (E)-α-santalol and (E)-β-santalol as minor products (see Table 12 and FIGS. 17A-17D).

4. SaCYP76F43 (SaCYP76-G18)

Microsomes containing SaCYP76F43 (SaCYP76-G18) were assayed for their activity using the assay described in Section B. above using a mixture of santalenes and bergamotene as the substrate. No activity was observed (see FIG. 17E), possibly due to low expression in yeast as evidenced by the corresponding CO difference spectrum (see FIG. 18).

C. Oxidation of Individual Sesquiterpenes

Microsome preparations containing a candidate P450 were assayed for their capacity to oxidize individual sesquiterpenes. The sesquiterpenes were isolated as described in Example 9 above. Three fractions containing mainly α-santalene, α-trans-bergamotene, or epi-β-santalene and β-santalene were used as individual substrates in assays containing clade I P450 SaCYP76F39v1 (SaCYP76-G10) or clade II P450 SaCYP76F37v1 (SaCYP76-G11). The assays were performed as described in Section B.1.b. above and products were identified by comparison to authentic standards (see Table 12 and FIG. 20G).

Reaction of SaCYP76F39v1 (SaCYP76-G10) with α-santalene produced (Z)- and (E)-α-santalol, while only (E)-α-santalol was produced with SaCYP76F37v1 (SaCYP76-G11) (see FIG. 20A versus FIG. 20D). With α-trans-bergamotene, SaCYP76F39v1 (SaCYP76-G10) produced (Z)- and (E)-α-trans-bergamotol, while only (E)-α-trans-bergamotol formation was observed for SaCYP76F37v1 (SaCYP76-G11) (see FIG. 20B versus FIG. 20E). SaCYP76F39v1 (SaCYP76-G10) gave four products, (Z)- and (E)-epi-β-santalol and (Z)- and (E)-β-santalol, in assays with epi-β-santalene and β-santalene, whereas only (E)-β-santalol was detected in assays with SaCYP76F37v1 (SaCYP76-G11) (see FIG. 20C versus FIG. 20F). These results confirm the activities observed with microsome in vitro assays with the mixture of santalenes and bergamotene (Section B above).

Summary of Results from Examples 10 and 11

Clade I S. Album P450 Santalene/Bergamotene Oxidases

Clade I S. album P450s SaCYP76F39v1 (SaCYP76-G10), SaCYP76F39v2 (SaCYP76-G15), SaCYP76F40 (SaCYP76-G16), SaCYP76F41 (SaCYP76-G17) and SaCYP76F42 (SaCYP76-G13) catalyzed the oxidation of santalenes and bergamotene, producing the (Z) and (E) stereoisomers of α-, β- and epi-β-santalols and bergamotols. The (Z):(E) ratios of α- and β-santalol produced by the P450s were approximately 1:5 and 1:4, respectively. Thus SaCYP76F39v1 (SaCYP76-G10), SaCYP76F39v2 (SaCYP76-G15), SaCYP76F40 (SaCYP76-G16), SaCYP76F41 (SaCYP76-G17) and SaCYP76F42 (SaCYP76-G13) were identified as santalene/bergamotene oxidases.

Clade II S. Album P450 Bergamotene Oxidases

Clade II S. album P450s SaCYP76F37v1 (SaCYP76-G11), SaCYP76F38v2 (SaCYP76-G12), SaCYP76F37v2 (SaCYP76-G14) and SaCYP76F38v1 (SaCYP76-G5) primarily catalyzed the oxidation of bergamotene into bergamotol, with (E)-α-trans-bergamotol as the major product and minor amounts of (E)-α-santalol and (E)-β-santalol observed. SaCYP76F37v1 (SaCYP76-G11), SaCYP76F38v2 (SaCYP76-G12), SaCYP76F37v2 (SaCYP76-G14) and SaCYP76F38v1 (SaCYP76-G5) were identified as bergamotene oxidases.

Example 12 Kinetic Properties

To test the kinetics of the clade I and clade II SaCYP76F enzymes, kinetic assays were performed with SaCYP76F37v1 (SaCYP76-G11) and SaCYP76F39v1 (SaCYP76-G10) with α-santalene or β-santalene as the substrate. Assays were performed in 400 μL reaction volumes containing 50 mM potassium phosphate pH 7.5, 0.8 mM NADPH and substrate concentrations of 12 to 138 μM of α-santalene or β-santalene. Enzyme reactions were initiated by adding either 17 pmol of SaCYP76F39v1 or 35 pmol of SaCYP76F37v1, incubated at 30° C. for 20 minutes with shaking and stopped by adding 500 μL hexane. The organic layer was transferred to a new GC vial, concentrated under a gentle stream of N₂ gas to about 100 μL and analyzed by GC-MS analysis (as described in Example 6B above). Kinetic data were evaluated using tools described in Hernandez and Ruiz ((1998) Bioinformatics. 14:227-228).

The apparent K_(m) values, k_(cat) values and k_(cat)/K_(m) values for SaCYP76F39v1 (SaCYP76-G10) and SaCYP76F37v1 (SaCYP76-G11) with α-santalene and β-santalene are set forth in Table 14 below.
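To illustrate how the tested substrate range of 12 to 138 μM relates to the apparent K_(m) values reported in Table 14, the fractional saturation of each enzyme can be estimated from the Michaelis-Menten relation v/V_(max) = [S]/(K_(m) + [S]). The sketch below is illustrative only and assumes simple Michaelis-Menten behavior; the function and variable names are not part of the described methods.

```python
def fractional_saturation(substrate_um: float, km_um: float) -> float:
    """Fraction of Vmax reached at a given substrate concentration (Michaelis-Menten)."""
    return substrate_um / (km_um + substrate_um)


# Apparent Km values for alpha-santalene, taken from Table 14 (uM).
KM_ALPHA_SANTALENE_UM = {"SaCYP76F39v1": 25.92, "SaCYP76F37v1": 133.0}

# Lowest and highest substrate concentrations used in the kinetic assays (uM).
LOW_UM, HIGH_UM = 12.0, 138.0

saturation = {
    enzyme: (fractional_saturation(LOW_UM, km), fractional_saturation(HIGH_UM, km))
    for enzyme, km in KM_ALPHA_SANTALENE_UM.items()
}
for enzyme, (low, high) in saturation.items():
    print(f"{enzyme}: {low:.2f} of Vmax at {LOW_UM} uM, {high:.2f} at {HIGH_UM} uM")
```

Under this assumption the tested range spans roughly 0.3 to 0.8 of V_(max) for SaCYP76F39v1 but only about 0.1 to 0.5 of V_(max) for SaCYP76F37v1, so the higher K_(m) of the clade II enzyme is estimated from a less saturated portion of the curve.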

TABLE 14
Kinetic constants for SaCYP76F39v1 and SaCYP76F37v1

α-santalene
P450                         K_(m) (μM)      k_(cat) (s⁻¹)   k_(cat)/K_(m) (s⁻¹ M⁻¹)
SaCYP76F39v1 (SaCYP76-G10)   25.92 ± 0.11    1.12            4.3 × 10⁴
SaCYP76F37v1 (SaCYP76-G11)   133 ± 0.41      0.2             1.5 × 10³

β-santalene
P450                         K_(m) (μM)      k_(cat) (s⁻¹)   k_(cat)/K_(m) (s⁻¹ M⁻¹)
SaCYP76F39v1 (SaCYP76-G10)   34.82 ± 0.41    1.17            3.3 × 10⁴
SaCYP76F37v1 (SaCYP76-G11)   157 ± 0.17      0.13            8.1 × 10²

Example 13 Substrate Specificity

A. Substrate Specificity of Clade I and Clade II SaCYP76F Enzymes
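The catalytic efficiencies in Table 14 follow from dividing k_(cat) by K_(m) after converting K_(m) from μM to M. The short check below recomputes them from the tabulated K_(m) and k_(cat) values; it is a consistency sketch rather than a description of how the constants were fitted, and the small deviations it shows may simply reflect rounding of the tabulated inputs.

```python
# (enzyme, substrate) -> (Km in uM, kcat in s^-1, reported kcat/Km in s^-1 M^-1), from Table 14
TABLE_14 = {
    ("SaCYP76F39v1", "alpha-santalene"): (25.92, 1.12, 4.3e4),
    ("SaCYP76F37v1", "alpha-santalene"): (133.0, 0.2, 1.5e3),
    ("SaCYP76F39v1", "beta-santalene"): (34.82, 1.17, 3.3e4),
    ("SaCYP76F37v1", "beta-santalene"): (157.0, 0.13, 8.1e2),
}


def catalytic_efficiency(km_um: float, kcat_per_s: float) -> float:
    """kcat/Km in s^-1 M^-1, converting Km from micromolar to molar."""
    return kcat_per_s / (km_um * 1e-6)


deviations = {}
for key, (km_um, kcat, reported) in TABLE_14.items():
    computed = catalytic_efficiency(km_um, kcat)
    deviations[key] = abs(computed - reported) / reported
    print(f"{key[0]} / {key[1]}: computed {computed:.3g}, reported {reported:.3g}")
```

On these numbers the clade I enzyme SaCYP76F39v1 is roughly 30- to 40-fold more efficient than the clade II enzyme SaCYP76F37v1 with either santalene.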

To test the range of substrates used by the clade I and clade II SaCYP76F enzymes, yeast microsomes containing SaCYP76F37v1 (SaCYP76-G11) and SaCYP76F39v1 (SaCYP76-G10) were assayed for their ability to convert various sesquiterpenes, including the substrates α-santalene and β-santalene and 7 additional sesquiterpenes that resemble santalenes in the acyclic isoprenyl side chain: α-curcumene, zingiberene, β-bisabolene, β-sesquiphellandrene, α-bisabolol, trans-β-farnesene and trans-nerolidol. Each substrate was tested using the in vitro assay described in Example 11.B.1.b. above.

The results are shown in Table 15 below, which sets forth the substrates, including their structures, and the relative activities, which represent the rate of product formation relative to product formation by SaCYP76F39v1 (SaCYP76-G10) with β-santalene. As shown in the table, SaCYP76F39v1 (SaCYP76-G10) and SaCYP76F37v1 (SaCYP76-G11) exhibited narrow substrate selectivity, with both preferring santalenes, including α-santalene or β-santalene, as substrates. SaCYP76F39v1 (SaCYP76-G10) efficiently converted only the two santalenes and had low activity with α-bisabolol. SaCYP76F39v1 (SaCYP76-G10) did not use α-curcumene, zingiberene, β-bisabolene, β-sesquiphellandrene, trans-β-farnesene or trans-nerolidol as a substrate. Similarly, SaCYP76F37v1 (SaCYP76-G11) was selectively active with the two santalenes and trans-nerolidol.

TABLE 15
Relative activities of SaCYP76F39v1 and SaCYP76F37v1 with various sesquiterpene substrates.

Substrate              SaCYP76F39v1          SaCYP76F37v1
                       (SaCYP76-G10) [%]*    (SaCYP76-G11) [%]*
α-santalene            99.8                  17.3
β-santalene            100                   17.7
α-curcumene            0                     0
zingiberene            0                     0
β-bisabolene           0                     0
β-sesquiphellandrene   0                     0
α-bisabolol            9.4                   0
trans-β-farnesene      0                     0
trans-nerolidol        0                     11.3

*Relative activities represent rate of product formation relative to product formation by SaCYP76F39v1 with β-santalene.
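The normalization used for the relative activities in Table 15 can be sketched as follows. The raw rates in this example are hypothetical numbers in arbitrary units, chosen only to show the arithmetic; they are not measured values, and the function name is an illustrative assumption.

```python
def relative_activities(rates: dict, reference_key) -> dict:
    """Express each rate as a percentage of the reference rate."""
    reference = rates[reference_key]
    if reference <= 0:
        raise ValueError("reference rate must be positive")
    return {key: 100.0 * rate / reference for key, rate in rates.items()}


# Hypothetical raw rates of product formation (arbitrary units), NOT measured data.
hypothetical_rates = {
    ("SaCYP76F39v1", "beta-santalene"): 2.0,    # reference reaction
    ("SaCYP76F39v1", "alpha-bisabolol"): 0.188,
    ("SaCYP76F37v1", "beta-santalene"): 0.354,
}

percent = relative_activities(hypothetical_rates, ("SaCYP76F39v1", "beta-santalene"))
for (enzyme, substrate), value in percent.items():
    print(f"{enzyme} with {substrate}: {value:.1f}%")
```

Because every value is scaled to a single reference reaction, the percentages can be compared both across substrates for one enzyme and across the two enzymes.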

B. Oxidation of Various Mono- and Sesquiterpenes Substrates

Yeast microsomes containing S. album cytochrome P450 SaCYP76F38v1 (SaCYP76-G5) and cytochrome P450 reductase were directly assayed for their capacity to oxidize different mono- and sesquiterpene substrates, including linalool, geraniol, nerol, nerolidol and bisabolol. The reaction mixtures contained 50 mM potassium phosphate, 0.8 mM NADPH and 60 to 80 μM of the terpene substrate in a total volume of 350 μL. Enzyme reactions were started by adding 50 μL of the microsome preparation, incubated at 30° C. for 1 hour with shaking and stopped by extraction with 500 μL of hexane/ethyl acetate (85:15). The organic layer was concentrated under a gentle stream of N₂ gas to about 100 μL and analyzed by GC-MS analysis as described in Example 6. Results were compared to vector control. The reaction products were identified based on matches of the MS fragmentation patterns with entries in the NIST and Wiley libraries (Wiley Registry® 9^(th) Edition/NIST 2011; Fred W. McLafferty, John Wiley & Sons, Inc.).

1. SaCYP76F38v1 (SaCYP76-G5)

Reaction of SaCYP76F38v1 (SaCYP76-G5) with linalool resulted in two products: Peak 1, with a retention time of approximately 17.5 minutes, and Peak 2, with a retention time of approximately 18.5 minutes. Linalool had a retention time of approximately 10.5 minutes. The best matches for the MS fragmentation patterns of Peaks 1 and 2 correspond to 3,8-dimethyl-1,7-octadien-6-ol and 8-hydroxylinalool, respectively. Reaction of SaCYP76F38v1 (SaCYP76-G5) with geraniol resulted in one product with a retention time of approximately 21 minutes. Geraniol had a retention time of approximately 14 minutes. The best match for this peak's MS fragmentation pattern corresponds to trans,trans-2,6-dimethyl-2,6-octadiene-1,8 diol. Reaction of SaCYP76F38v1 (SaCYP76-G5) with nerol resulted in one product with a retention time of approximately 20.8 minutes, whereas nerol had a retention time of approximately 13.4 minutes. The best match for this peak's MS fragmentation pattern corresponds to 2,6-dimethyl-2,6-octadiene-1,8 diol. Reaction of SaCYP76F38v1 (SaCYP76-G5) with nerolidol resulted in two products, with retention times of approximately 21.3 and 22.3 minutes, whereas nerolidol had a retention time of approximately 16.1 minutes. Reaction of SaCYP76F38v1 (SaCYP76-G5) with bisabolol resulted in one product having a retention time of approximately 25.2 minutes, with bisabolol having a retention time of approximately 17.6 minutes. The MS fragmentation patterns of products formed by reaction of SaCYP76F38v1 (SaCYP76-G5) with nerolidol and bisabolol did not match known substances in the MS fragmentation pattern databases.

2. SaCYP76F39v1 (SaCYP76-G10)

SaCYP76F39v1 (SaCYP76-G10) also catalyzed the hydroxylation of linalool, nerol and bisabolol in vitro. In each case, product formation was the same as that catalyzed by SaCYP76F38v1 (SaCYP76-G5) described above.

Since modifications will be apparent to those of skill in this art, it is intended that this invention be limited only by the scope of the appended claims.

1. A host cell, comprising a nucleic acid molecule encoding a cytochrome P450 oxidase polypeptide or a catalytically active portion thereof, wherein: (a) the encoded cytochrome P450 oxidase polypeptide or catalytically active portion thereof exhibits at least 85% sequence identity to a P450 oxidase polypeptide set forth in SEQ ID NO:6, 7, 8, 9, 50, 74, 75, 76, or 77, or a corresponding catalytically active portion thereof; (b) the encoded cytochrome P450 oxidase or catalytically active fragment thereof catalyzes hydroxylation or monooxygenation of santalene and/or bergamotene; and (c) the nucleic acid molecule is heterologous to the host cell.
 2. The host cell of claim 1, wherein the encoded cytochrome P450 oxidase polypeptide exhibits at least 95% sequence identity to the cytochrome P450 oxidase polypeptide set forth in SEQ ID NO:6, 7, 8, 9, 50, 74, 75, 76, or 77.
 3. The host cell of claim 1, wherein the encoded cytochrome P450 oxidase polypeptide or catalytically active portion thereof is a Santalum album P450 oxidase polypeptide.
 4. The host cell of claim 1, wherein the cytochrome P450 oxidase polypeptide or catalytically active fragment catalyzes formation of a santalol from a santalene, or a bergamotol from a bergamotene.
 5. The host cell of claim 1, wherein the nucleic acid molecule comprises a sequence of nucleotides selected from among: (a) a sequence of nucleic acids set forth in any of SEQ ID NO:2, 3, 4, 5, 67, 68, 69, 70, or 71; (b) a sequence of nucleic acids encoding a protein having at least 85% sequence identity to a protein encoded by the sequence of nucleic acids set forth in any of SEQ ID NO:2, 3, 4, 5, 67, 68, 69, 70, or 71; and (c) a sequence of nucleic acids comprising degenerate codons of one or more codons in the sequence of nucleic acids of (a) or (b).
 6. The host cell of claim 5, wherein the sequence of nucleic acid has at least 95% sequence identity to a sequence of nucleic acids set forth in any of SEQ ID NO:2, 3, 4, 5, 67, 68, 69, 70, or 71.
 7. The host cell of claim 1, comprising a nucleic acid molecule encoding a cytochrome P450 reductase or a catalytically active portion thereof, wherein: (a) the encoded cytochrome P450 reductase or catalytically active portion thereof exhibits at least 95% sequence identity to a cytochrome P450 reductase polypeptide set forth in SEQ ID NO:12 or 13; (b) the encoded cytochrome P450 reductase polypeptide or catalytically active fragment thereof catalyzes transfer of two electrons from NADPH to an electron acceptor; and (c) the nucleic acid molecule is heterologous to the host cell.
 8. The host cell of claim 1, further comprising a nucleic acid molecule encoding a santalene synthase comprising the sequence of amino acids set forth in any of SEQ ID NO:17, 52, or 53, or a sequence of amino acids that is at least 95% identical to any of SEQ ID NO:17, 52, or 53, or a catalytically active fragment thereof.
 9. The host cell of claim 1, wherein the cell is a prokaryotic cell or an eukaryotic cell.
 10. The host cell of claim 1, wherein the cell produces farnesyl diphosphate natively or is modified to produce more farnesyl diphosphate compared to an unmodified cell.
 11. A host cell, comprising: (a) a nucleic acid molecule encoding a cytochrome P450 oxidase polypeptide or a catalytically active portion thereof, wherein: (i) the encoded cytochrome P450 oxidase polypeptide or catalytically active portion thereof exhibits at least 85% sequence identity to SEQ ID NO:6, 7, 8, 9, 50, 74, 75, 76, or 77; and (ii) the encoded cytochrome P450 oxidase polypeptide or catalytically active fragment thereof catalyzes hydroxylation or monooxygenation of santalene and/or bergamotene; (b) a nucleic acid molecule encoding a cytochrome P450 reductase or catalytically active portion thereof, wherein the encoded cytochrome P450 reductase or catalytically active portion thereof comprises the sequence of amino acids set forth in SEQ ID NO:12 or 13, or a sequence of amino acids that has at least 95% sequence identity to a cytochrome P450 reductase polypeptide set forth in SEQ ID NO:12 or 13; and (c) a nucleic acid molecule encoding a santalene synthase, wherein the encoded santalene synthase comprises the sequence of amino acids set forth in any of SEQ ID NO:17, 52, or 53 or a sequence of amino acids that is at least 95% identical to any of SEQ ID NO:17, 52, or 53 or a catalytically active fragment thereof, wherein: at least one of the nucleic acid molecules set forth in (a) or (b) is heterologous to the host cell.
 12. The host cell of claim 11, wherein the cell is a prokaryotic cell or an eukaryotic cell.
 13. The host cell of claim 11, wherein the cell produces farnesyl diphosphate natively or is modified to produce more farnesyl diphosphate compared to an unmodified cell.
 14. An isolated nucleic acid molecule encoding a cytochrome P450 oxidase polypeptide or a catalytically active portion thereof, wherein: (a) the nucleic acid molecule is cDNA; (b) the encoded cytochrome P450 oxidase polypeptide or catalytically active portion thereof exhibits at least 85% sequence identity to SEQ ID NO:6, 7, 8, 9, 50, 74, 75, 76, or 77; and (c) the encoded cytochrome P450 oxidase or catalytically active fragment thereof catalyzes hydroxylation or monooxygenation of santalene and/or bergamotene.
 15. The nucleic acid molecule of claim 14, wherein the encoded cytochrome P450 oxidase polypeptide exhibits at least 95% sequence identity to SEQ ID NO:6, 7, 8, 9, 50, 74, 75, 76, or 77.
 16. A vector, comprising the nucleic acid molecule of claim 14.
 17. A host cell, comprising a vector of claim 16.
 18. A method for producing a cytochrome P450 oxidase polypeptide or a catalytically active fragment thereof, comprising: (a) culturing the cells of claim 1 under conditions suitable for expression of the cytochrome P450 oxidase polypeptide; and (b) optionally isolating the cytochrome P450 oxidase polypeptide.
 19. A method for producing a santalol, bergamotol and/or mixtures thereof, comprising: (a) culturing a host cell of claim 11 under conditions suitable for the formation of a santalol, bergamotol and/or mixtures thereof; wherein the host cell of claim 11 expresses the nucleic acid molecules of part (a), (b) and (c); and (b) optionally isolating the santalol, bergamotol and/or mixtures thereof.
 20. The host cell of claim 1, wherein the encoded cytochrome P450 oxidase polypeptide or catalytically active fragment thereof comprises a sequence of amino acids set forth in SEQ ID NO:7.